RLSeek: Evidence-Grounded Reasoning for RAG Hallucination Detection

Zhaoheng Huang; Dacheng Wen; Yutao Zhu (朱余韬); Xiaoying Lian; Yushi Liang; Kai Hao; Nan Li; Liangjie Zhang; Qi Zhang; Ji-Rong Wen; Zhicheng Dou (窦志成); Fangzhao Wu

Abstract

Large language models (LLMs) in retrieval-augmented generation systems can still produce hallucinations, generating content that is unsupported or contradicted by the source texts and that undermines reliability. Recent work has addressed this problem by training span-level hallucination detectors using reinforcement learning (RL) and chain-of-thought (CoT) reasoning. In this work, we show through error analysis that incorrect predictions by existing reasoning-based detectors are strongly associated with CoT processes that lack explicit grounding in source evidence, particularly when verification steps do not quote the retrieved documents or check claims against them. This behaviour contrasts with human verification practices in benchmarks such as RAGTruth, where evidence quotation is a prerequisite for determining hallucinated spans. Motivated by this observation, we propose an evidence-grounded RL framework, namely RLSeek, to explicitly enforce active evidence seeking during CoT reasoning by requiring quotation of relevant source segments at each verification step. Experiments on the RAGTruth and NewsSum datasets demonstrate consistent improvements in hallucination span detection performance, with limited additional reasoning overhead and improved robustness in out-of-domain settings.

Anthology ID: 2026.acl-long.1492
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 32329–32347
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1492/
Cite (ACL): Zhaoheng Huang, Dacheng Wen, Yutao Zhu, Xiaoying Lian, Yushi Liang, Kai Hao, Nan Li, Liangjie Zhang, Qi Zhang, Ji-Rong Wen, Zhicheng Dou, and Fangzhao Wu. 2026. RLSeek: Evidence-Grounded Reasoning for RAG Hallucination Detection. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 32329–32347, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): RLSeek: Evidence-Grounded Reasoning for RAG Hallucination Detection (Huang et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1492.pdf
Checklist: 2026.acl-long.1492.checklist.pdf
