CliniCAST: Benchmarking Acoustic Grounding and Text Dominance in Medical Triage

Kyusik Kim, Hyunwoo Yoo, Jaehoon Choi, Kitae Kim, Gail Rosen, Bongwon Suh


Abstract
Recent Large Audio-Language Models (LALMs) integrate acoustic capabilities into reasoning, yet whether they reliably ground clinical judgments in audible evidence remains unproven. We introduce CliniCAST (Clinical Controlled Acoustic Synthetic Triage), a controlled benchmark that disentangles clinically meaningful acoustic cues from lexical content and speaker demographics. CliniCAST comprises 5,856 synthetic samples across 12 disease conditions: 4,800 audio samples forming 2,400 tagged–untagged pairs for five-level emergency triage, and 1,056 audio–text inconsistent samples in which reassuring speech is paired with high-risk acoustic cues. Evaluating a diverse suite of audio-capable foundation models, we find that LALMs exhibit fragile acoustic grounding and a pronounced “text dominance” failure mode: reassuring lexical content suppresses response to audible distress signals even under safety-critical conditions. Age and gender interactions are weak across conditions, indicating that the primary failure mode is insufficient cross-modal integration rather than demographic bias. These results suggest current LALMs are not yet robust enough for high-stakes medical triage, and motivate training objectives that explicitly enforce reliance on clinically grounded audible evidence.
Anthology ID:
2026.findings-acl.2056
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
41321–41343
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2056/
Cite (ACL):
Kyusik Kim, Hyunwoo Yoo, Jaehoon Choi, Kitae Kim, Gail Rosen, and Bongwon Suh. 2026. CliniCAST: Benchmarking Acoustic Grounding and Text Dominance in Medical Triage. In Findings of the Association for Computational Linguistics: ACL 2026, pages 41321–41343, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
CliniCAST: Benchmarking Acoustic Grounding and Text Dominance in Medical Triage (Kim et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2056.pdf
Checklist:
2026.findings-acl.2056.checklist.pdf