Principled Detection of Hallucinations in Large Language Models via Multiple Testing

Jiawei Li; Akshayaa Magesh; Venugopal Veeravalli

Abstract

While Large Language Models (LLMs) have emerged as powerful foundation models for solving a variety of tasks, they have also been shown to be prone to hallucinations, i.e., generating responses that sound confident but are actually incorrect or even nonsensical. Existing hallucination detectors propose a wide range of empirical scoring rules, but their performance varies across models and datasets, which makes it hard to determine which detector to rely on in practice. In this work, we formulate the detection of hallucinations as a hypothesis testing problem and draw parallels with the problem of out-of-distribution detection in machine learning models. We then propose a multiple-testing-inspired method that systematically aggregates multiple evaluation scores via conformal p-values, enabling calibrated detection with a controlled false alarm rate. Extensive experiments across diverse models and datasets validate the robustness of our approach against state-of-the-art methods.
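To make the idea of aggregating several scores through conformal p-values concrete, the following is a minimal sketch and not the authors' implementation. It assumes that higher scores indicate stronger evidence of hallucination, that a calibration set of non-hallucinated responses is available and exchangeable with non-hallucinated test responses, and that the p-values are combined with a Bonferroni correction, which is a stand-in chosen for simplicity and may differ from the aggregation rule used in the paper. The names `conformal_p_value`, `detect`, `cal_matrix`, and `alpha` are hypothetical.

```python
import numpy as np


def conformal_p_value(cal_scores, test_score):
    """Conformal p-value of one test score against calibration scores.

    Assumption: higher score = more evidence of hallucination, and the
    calibration scores come from non-hallucinated responses.
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    # Fraction of calibration scores at least as extreme as the test score,
    # with the usual +1 correction in numerator and denominator.
    return (1.0 + np.sum(cal_scores >= test_score)) / (cal_scores.size + 1.0)


def detect(cal_matrix, test_scores, alpha=0.1):
    """Flag a response as hallucinated using K scores at once.

    cal_matrix:  shape (n, K), K scores for each of n calibration responses.
    test_scores: length-K scores for the response under test.
    alpha:       target false alarm rate.

    Assumption: Bonferroni aggregation (flag if min p <= alpha / K), which
    is a simple stand-in, not necessarily the paper's rule.
    """
    cal_matrix = np.asarray(cal_matrix, dtype=float)
    n, k = cal_matrix.shape
    p_values = np.array(
        [conformal_p_value(cal_matrix[:, j], test_scores[j]) for j in range(k)]
    )
    flagged = bool(p_values.min() <= alpha / k)
    return flagged, p_values


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic calibration scores for K = 3 hypothetical detectors.
    cal = rng.normal(size=(200, 3))
    print(detect(cal, [0.1, -0.3, 0.2]))  # typical scores
    print(detect(cal, [0.1, 5.0, 0.2]))   # one detector fires strongly
```

Under the exchangeability assumption each conformal p-value is marginally valid, so the union bound keeps the false alarm probability of this Bonferroni rule at or below `alpha`. The price is conservativeness when the scores are strongly correlated.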

Anthology ID: 2026.findings-acl.1705
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 34132–34145
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1705/
Cite (ACL): Jiawei Li, Akshayaa Magesh, and Venugopal Veeravalli. 2026. Principled Detection of Hallucinations in Large Language Models via Multiple Testing. In Findings of the Association for Computational Linguistics: ACL 2026, pages 34132–34145, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Principled Detection of Hallucinations in Large Language Models via Multiple Testing (Li et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1705.pdf
Checklist: 2026.findings-acl.1705.checklist.pdf
