Preference Learning Unlocks LLMs’ Psycho-Counseling Skills

Mian Zhang; Shaun M. Eack; Zhiyu Chen

Abstract

Applying large language models (LLMs) to assist in psycho-counseling is an emerging and meaningful approach, driven by the significant gap between patient needs and the availability of mental health support. However, current LLMs struggle to consistently provide effective responses to client speeches, largely due to the lack of supervision from high-quality real psycho-counseling data, whose content is typically inaccessible due to client privacy concerns. Furthermore, the quality of therapists’ responses in available sessions can vary significantly based on their professional training and experience. Assessing the quality of therapists’ responses remains an open challenge. We address these challenges by first proposing a set of professional and comprehensive principles to evaluate therapists’ responses to client speeches. Using these principles, we create a **Psy**cho-**Co**unseling **Pref**erence dataset, **PsyCoPref**, which contains 36k high-quality preference comparison pairs. This dataset aligns with the preferences of professional psychotherapists, providing a robust foundation for evaluating and improving LLMs in psycho-counseling. Experiments on reward modeling and preference learning demonstrate that PsyCoPref is an excellent resource for LLMs to acquire essential skills for responding to clients in a counseling session. Our best-aligned model achieves an impressive win rate of 87% against GPT-4o.

Anthology ID: 2026.findings-acl.521
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10729–10750
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.521/
Cite (ACL): Mian Zhang, Shaun M. Eack, and Zhiyu Chen. 2026. Preference Learning Unlocks LLMs’ Psycho-Counseling Skills. In Findings of the Association for Computational Linguistics: ACL 2026, pages 10729–10750, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Preference Learning Unlocks LLMs’ Psycho-Counseling Skills (Zhang et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.521.pdf
Checklist: 2026.findings-acl.521.checklist.pdf
