Improving Medical Hallucination Detection with System Combination and Rule-based Customization

Jonathan Lasko, Damianos Karakos, Francis Keith


Abstract
Factuality errors (hallucinations) in the outputs of patient-facing medical chatbots are a serious problem: they can lead to patient harm and erode people’s trust in the medical profession. It is therefore crucial to detect hallucinations in chatbot outputs and forward them to clinicians for review. In this paper, we present the system we built for detecting such errors: it consists of multiple LLM-powered detectors which are combined using a novel alignment procedure. We ran our system on the MedExpert-Benchmark dataset (Yarmohammadi et al., 2025); our results on two use cases, Mental Health and Prenatal Care, show that the combined system improves over the individual systems. Additionally, we show that customizing the system to each use case yields further gains, but at the cost of reduced generalizability. Our code and dataset are available here: https://github.com/BBN-E/medic-customnlp4u.
Anthology ID:
2026.customnlp4u-1.14
Volume:
Proceedings of the Second Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Sheshera Mysore, Sachin Kumar, Vidhisha Balachandran, Shirley Anugrah Hayati, Faeze Brahman, Hanane Nour Moussa, Alireza Salemi
Venues:
CustomNLP4U | WS
Publisher:
Association for Computational Linguistics
Pages:
160–166
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.customnlp4u-1.14/
Cite (ACL):
Jonathan Lasko, Damianos Karakos, and Francis Keith. 2026. Improving Medical Hallucination Detection with System Combination and Rule-based Customization. In Proceedings of the Second Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U), pages 160–166, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Improving Medical Hallucination Detection with System Combination and Rule-based Customization (Lasko et al., CustomNLP4U 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.customnlp4u-1.14.pdf