Benchmarking Large Language Models on Communicative Medical Coaching: A Dataset and a Novel System

Hengguan Huang, Songtao Wang, Hongfu Liu, Hao Wang, Ye Wang


Abstract
Traditional applications of natural language processing (NLP) in healthcare have predominantly focused on patient-centered services, enhancing patient interactions and care delivery, such as through medical dialogue systems. However, the potential of NLP to benefit inexperienced doctors, particularly in areas such as communicative medical coaching, remains largely unexplored. We introduce “ChatCoach”, a human-AI cooperative framework designed to assist medical learners in practicing their communication skills during patient consultations. ChatCoach differentiates itself from conventional dialogue systems by offering a simulated environment where medical learners can practice dialogues with a patient agent, while a coach agent provides immediate, structured feedback. This is facilitated by our proposed Generalized Chain-of-Thought (GCoT) approach, which fosters the generation of structured feedback and enhances the utilization of external knowledge sources. Additionally, we have developed a dataset specifically for evaluating Large Language Models (LLMs) within the ChatCoach framework on communicative medical coaching tasks. Our empirical results validate the effectiveness of ChatCoach.
Anthology ID: 2024.findings-acl.94
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1624–1637
URL: https://aclanthology.org/2024.findings-acl.94
DOI: 10.18653/v1/2024.findings-acl.94
Cite (ACL): Hengguan Huang, Songtao Wang, Hongfu Liu, Hao Wang, and Ye Wang. 2024. Benchmarking Large Language Models on Communicative Medical Coaching: A Dataset and a Novel System. In Findings of the Association for Computational Linguistics: ACL 2024, pages 1624–1637, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Benchmarking Large Language Models on Communicative Medical Coaching: A Dataset and a Novel System (Huang et al., Findings 2024)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/2024.findings-acl.94.pdf