Luna: A Lightweight Evaluation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost

Masha Belyi; Robert Friel; Shuai Shao; Atindriyo Sanyal

Abstract

Retriever-Augmented Generation (RAG) systems have become pivotal in enhancing the capabilities of language models by incorporating external knowledge retrieval mechanisms. However, a significant challenge in deploying these systems in industry applications is the detection and mitigation of hallucinations: instances where the model generates information that is not grounded in the retrieved context. Addressing this issue is crucial for ensuring the reliability and accuracy of responses generated by large language models (LLMs) in industry settings. Current hallucination detection techniques fail to deliver accuracy, low latency, and low cost simultaneously. We introduce Luna: a DeBERTa-large encoder, fine-tuned for hallucination detection in RAG settings. We demonstrate that Luna outperforms GPT-3.5 and commercial evaluation frameworks on the hallucination detection task, with 97% and 91% reduction in cost and latency, respectively. Luna’s generalization capacity across multiple industry verticals and out-of-domain data makes it a strong candidate for guardrailing industry LLM applications.

Anthology ID: 2025.coling-industry.34
Volume: Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, Kareem Darwish, Apoorv Agarwal
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 398–409
URL: https://preview.aclanthology.org/add-emnlp-2024-awards/2025.coling-industry.34/
Cite (ACL): Masha Belyi, Robert Friel, Shuai Shao, and Atindriyo Sanyal. 2025. Luna: A Lightweight Evaluation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost. In Proceedings of the 31st International Conference on Computational Linguistics: Industry Track, pages 398–409, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Luna: A Lightweight Evaluation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost (Belyi et al., COLING 2025)
PDF: https://preview.aclanthology.org/add-emnlp-2024-awards/2025.coling-industry.34.pdf
