Towards Conditioning Clinical Text Generation for User Control

Osman Alperen Koraş, Rabi Bahnan, Jens Kleesiek, Amin Dada


Abstract
Deploying natural language generation systems in clinical settings remains challenging despite advances in Large Language Models (LLMs), which continue to exhibit hallucinations and factual inconsistencies, necessitating human oversight. This paper explores automated dataset augmentation using LLMs as human proxies to condition LLMs for clinician control without increasing cognitive workload. On the BioNLP ACL’24 Discharge Me! Shared Task, we achieve new state-of-the-art results with simpler methods than prior submissions through more efficient training, yielding a 9% relative improvement without augmented training and up to 34% with dataset augmentation. Preliminary human evaluation further supports the effectiveness of our approach, highlighting the potential of augmenting clinical text generation for control to enhance relevance, accuracy, and factual consistency.
Anthology ID:
2025.findings-acl.549
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10549–10569
URL:
https://preview.aclanthology.org/landing_page/2025.findings-acl.549/
Cite (ACL):
Osman Alperen Koraş, Rabi Bahnan, Jens Kleesiek, and Amin Dada. 2025. Towards Conditioning Clinical Text Generation for User Control. In Findings of the Association for Computational Linguistics: ACL 2025, pages 10549–10569, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Towards Conditioning Clinical Text Generation for User Control (Koraş et al., Findings 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.findings-acl.549.pdf