Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation

Moy Yuan; Han-Chin Shing; Mitch Strong; Chaitanya Shivade

Abstract

Accurate clinical coding is essential for healthcare documentation, billing, and decision-making. While prior work shows that off-the-shelf LLMs struggle with this task, evaluations based on exact match metrics often overlook errors where predicted codes are hierarchically close but incorrect. Our analysis reveals that such hierarchical misalignments account for a substantial portion of LLM failures. We show that lightweight interventions, including prompt engineering and small-scale fine-tuning, can improve accuracy without the computational overhead of search-based methods. To address hierarchically near-miss errors, we introduce clinical code verification as both a standalone task and a pipeline component. To mitigate the limitations in existing datasets, such as incomplete evidence and inpatient bias in MIMIC, we release an expert double-annotated benchmark of outpatient clinical notes with ICD-10 codes. Our results highlight verification as an effective and reliable step toward improving LLM-based medical coding.
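
To make the notion of a hierarchically close but incorrect prediction concrete, the following is a minimal sketch, not the authors' evaluation code. It assumes that two ICD-10 codes count as a near miss when they share the same three-character category (for example, E11.9 and E11.65 both fall under E11) but are not identical. The function names and the three-way labels are hypothetical and chosen only for illustration.

```python
def normalize(code: str) -> str:
    """Strip whitespace and the decimal point, and upper-case the code."""
    return code.strip().replace(".", "").upper()


def classify(predicted: str, gold: str) -> str:
    """Label a prediction as 'exact', 'near_miss', or 'miss'.

    Assumption: a near miss is a prediction that shares the gold code's
    three-character ICD-10 category but differs in the remaining characters.
    """
    p, g = normalize(predicted), normalize(gold)
    if p == g:
        return "exact"
    if p[:3] == g[:3]:
        return "near_miss"
    return "miss"


def summarize(pairs: list[tuple[str, str]]) -> dict[str, int]:
    """Count each outcome over (predicted, gold) pairs."""
    counts = {"exact": 0, "near_miss": 0, "miss": 0}
    for predicted, gold in pairs:
        counts[classify(predicted, gold)] += 1
    return counts
```

Under an exact-match metric, both the "near_miss" and "miss" outcomes score zero, which is why a breakdown like this can surface errors that the metric alone does not distinguish.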

Anthology ID: 2025.emnlp-industry.12
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2025
Address: Suzhou (China)
Editors: Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 173–184
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.12/
Cite (ACL): Moy Yuan, Han-Chin Shing, Mitch Strong, and Chaitanya Shivade. 2025. Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 173–184, Suzhou (China). Association for Computational Linguistics.
Cite (Informal): Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation (Yuan et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.12.pdf
