Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps

Jonas Waldendorf, Bashar Awwad Shiekh Hasan, Evgenii Tsymbalov


Abstract
Hallucinations in Speech Large Language Models (SpeechLLMs) pose significant risks, yet existing detection methods typically rely on gold-standard outputs that are costly or impractical to obtain. Moreover, hallucination detection methods developed for text-based LLMs do not directly capture audio-specific signals. We investigate four attention-derived metrics (AudioRatio, AudioConsistency, AudioEntropy, and TextEntropy) designed to capture pathological attention patterns associated with hallucination, and train lightweight logistic regression classifiers on these features for efficient inference-time detection. Across automatic speech recognition (ASR) and speech-to-text translation tasks, evaluations on Qwen-2-Audio and Voxtral-3B show that our approach outperforms uncertainty-based and prior attention-based baselines on in-domain data, achieving improvements of up to +0.23 PR-AUC, and generalises to out-of-domain ASR settings. We further find that strong performance can be achieved with approximately 100 attention heads, improving out-of-domain generalisation compared to using all heads. While effectiveness is model-dependent and task-specific training is required, our results demonstrate that attention patterns provide a valuable tool for hallucination detection in SpeechLLMs.
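
The abstract names the attention-derived features but does not define them, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes AudioRatio is the share of a generated token's attention mass that falls on audio positions, and that AudioEntropy and TextEntropy are the Shannon entropies of the attention restricted to audio and text positions respectively. AudioConsistency is omitted because the abstract gives no hint of its form. The function names, the single-row input format, and the weights passed to `hallucination_score` are all hypothetical.

```python
import math


def audio_ratio(attn_row, audio_mask):
    """Assumed definition: share of attention mass on audio positions."""
    total = sum(attn_row)
    audio = sum(a for a, is_audio in zip(attn_row, audio_mask) if is_audio)
    return audio / total if total > 0 else 0.0


def entropy(weights):
    """Shannon entropy (in nats) of weights renormalised to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)


def head_features(attn_row, audio_mask):
    """Features for one attention head and one generated token (hypothetical)."""
    audio = [a for a, m in zip(attn_row, audio_mask) if m]
    text = [a for a, m in zip(attn_row, audio_mask) if not m]
    return {
        "audio_ratio": audio_ratio(attn_row, audio_mask),
        "audio_entropy": entropy(audio),
        "text_entropy": entropy(text),
    }


def hallucination_score(features, weights, bias):
    """Logistic-regression probability from already-fitted (hypothetical) weights."""
    z = bias + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


# Toy attention row over four context positions: two audio, then two text.
example_row = [0.1, 0.1, 0.2, 0.6]
example_mask = [True, True, False, False]
example_features = head_features(example_row, example_mask)
```

In practice such features would be computed per head and per generated token, aggregated over the output, and fed to a classifier fitted on labelled examples; the choice of aggregation is likewise not specified in the abstract.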
Anthology ID:
2026.findings-acl.2147
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
43276–43289
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2147/
Cite (ACL):
Jonas Waldendorf, Bashar Awwad Shiekh Hasan, and Evgenii Tsymbalov. 2026. Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps. In Findings of the Association for Computational Linguistics: ACL 2026, pages 43276–43289, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps (Waldendorf et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2147.pdf
Checklist:
 2026.findings-acl.2147.checklist.pdf