RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation
Ian Poey | Jiajun Li | Qishuai Zhong
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

Real-time identification of out-of-context outputs from large language models (LLMs) is crucial for enterprises to safely adopt retrieval-augmented generation (RAG) systems. In this work, we develop lightweight models that detect when LLM-generated text semantically deviates from retrieved source documents. We compare their performance against open-source alternatives on data from credit policy and sustainability reports used in the banking industry. The fine-tuned DeBERTa model stands out for its superior performance, speed, and simplicity, as it requires no additional preprocessing or feature engineering. While recent research often prioritises state-of-the-art accuracy through fine-tuned generative LLMs and complex training pipelines, we demonstrate that detection models can be deployed efficiently, with high speed and minimal resource usage.
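The detection task described above can be pictured as a binary classifier over (generated text, source document) pairs. The sketch below shows that interface only; the `score` function here is a toy token-overlap stand-in, not the paper's fine-tuned DeBERTa classifier, and the threshold value is an assumption for illustration.

```python
# Sketch of an out-of-context detection interface (NOT the paper's model).
# A deployed system would replace `score` with a fine-tuned DeBERTa
# sequence-pair classifier; token overlap is used here as a toy stand-in.

def score(generated: str, source: str) -> float:
    """Return a rough grounding score in [0, 1] (toy overlap heuristic)."""
    gen = set(generated.lower().split())
    src = set(source.lower().split())
    if not gen:
        return 0.0
    return len(gen & src) / len(gen)


def is_out_of_context(generated: str, source: str, threshold: float = 0.5) -> bool:
    """Flag generated text whose grounding score falls below the threshold."""
    return score(generated, source) < threshold


doc = "the policy limits unsecured credit exposure to corporate clients"
# Grounded paraphrase: most tokens overlap with the source document.
print(is_out_of_context("unsecured credit exposure is limited by the policy", doc))  # False
# Unsupported claim: almost no overlap with the source document.
print(is_out_of_context("the bank reported record quarterly profits", doc))  # True
```

The same pair-in, flag-out interface applies regardless of whether the scorer is a heuristic or a learned model, which is what makes a lightweight classifier easy to swap in for real-time use.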