ReFL: Reflective Feedback Learning for Hallucination Detection of Large Language Models

Cunhang Fan; Jun Zhang; Xue Zhang; Shuai Zhang; Zhao Lv; Jianhua Tao; Zhengqi Wen

Abstract

Large Language Models (LLMs) often generate factually incorrect content, known as “hallucinations”, which undermines the reliability and safety of their outputs. Existing hallucination detection methods either depend on external knowledge sources, which incurs high computational costs and limits real-time applicability, or extract the model’s internal states, which generalizes poorly. To address these issues, this paper proposes ReFL, a hallucination detection framework. ReFL leverages corrective in-context learning to dynamically guide LLMs to recognize their own prediction errors and adjust internal representations, critically without updating model weights. Specifically, triplets of input text, model prediction, and ground-truth label are embedded into the prompt to make the model explicitly aware of its own errors. The model reflects on prior outputs to adjust its internal states and generate semantically structured representations better aligned with factuality. This feedback mechanism encourages the model to shape a more coherent semantic space and enhances the LLM’s internal sensitivity to hallucinations. Experimental results on two benchmark datasets demonstrate that ReFL consistently outperforms existing methods, achieving state-of-the-art performance.
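
The corrective in-context learning step described in the abstract can be pictured with a minimal Python sketch. It only shows how triplets of input text, model prediction, and ground-truth label might be embedded into a prompt. The `Feedback` class, the `build_corrective_prompt` function, the label strings, and the prompt wording are all hypothetical and are not taken from the paper, which may format its prompts quite differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feedback:
    """One corrective triplet (hypothetical structure, not from the paper)."""
    text: str        # input text the model judged earlier
    prediction: str  # the model's earlier prediction
    label: str       # the ground-truth label


def build_corrective_prompt(examples: List[Feedback], query: str) -> str:
    """Embed (text, prediction, label) triplets ahead of a new query.

    The wording is an illustrative assumption; the abstract states only
    that the three elements appear in the prompt.
    """
    blocks = ["Judge whether each statement is factual or hallucinated."]
    for ex in examples:
        # Tell the model explicitly whether its earlier answer was right.
        verdict = "correct" if ex.prediction == ex.label else "incorrect"
        blocks.append(
            f"Statement: {ex.text}\n"
            f"Your earlier prediction: {ex.prediction}\n"
            f"Ground truth: {ex.label}\n"
            f"Your earlier prediction was {verdict}."
        )
    # The new input comes last, with the prediction left open.
    blocks.append(f"Statement: {query}\nPrediction:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demos = [
        Feedback("The Eiffel Tower is in Berlin.", "factual", "hallucinated"),
        Feedback("Paris is the capital of France.", "factual", "factual"),
    ]
    print(build_corrective_prompt(demos, "The Moon orbits the Earth."))
```

Under this reading, the prompt would be passed to a frozen LLM and the resulting internal states used for detection. How those states are extracted and classified is not specified in the abstract, so the sketch leaves it out.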

Anthology ID: 2026.acl-long.899
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 19648–19665
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.899/
Cite (ACL): Cunhang Fan, Jun Zhang, Xue Zhang, Shuai Zhang, Zhao Lv, Jianhua Tao, and Zhengqi Wen. 2026. ReFL: Reflective Feedback Learning for Hallucination Detection of Large Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 19648–19665, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ReFL: Reflective Feedback Learning for Hallucination Detection of Large Language Models (Fan et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.899.pdf
Checklist: 2026.acl-long.899.checklist.pdf
