MARCH: Multi-Agent Reinforced Check for Hallucination

Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie Hu, Yu Qin, Erchao.zec, Xiaoxi Jiang, Guanjunjiang


Abstract
Hallucination remains a critical bottleneck for large language models (LLMs), undermining their reliability in real-world applications, especially in Retrieval-Augmented Generation (RAG) systems. While existing hallucination detection methods employ LLM-as-a-judge to verify LLM outputs against retrieved evidence, they suffer from inherent *confirmation bias*, where the verifier inadvertently reproduces the errors of the original generation. To address this, we introduce **M**ulti-**A**gent **R**einforced self-**C**heck for **H**allucination (MARCH), a framework that enforces rigorous factual alignment by leveraging deliberate *information asymmetry*. MARCH orchestrates a collaborative pipeline of three specialized agents: a Solver, a Proposer, and a Checker. The Solver generates an initial RAG response, which the Proposer decomposes into claim-level verifiable atomic propositions. Crucially, the Checker validates these propositions against retrieved evidence in isolation, deprived of the Solver’s original output. This well-crafted information asymmetry scheme breaks the cycle of self-confirmation bias. By training this pipeline with multi-agent reinforcement learning (MARL), we enable the agents to co-evolve and optimize factual adherence. Extensive experiments across hallucination benchmarks demonstrate that MARCH substantially reduces hallucination rates. Notably, an 8B-parameter LLM equipped with MARCH achieves performance competitive with powerful closed-source models. MARCH paves a scalable path for factual self-improvement of LLMs through co-evolution. The code is at https://github.com/Qwen-Applications/MARCH.
Anthology ID:
2026.acl-long.1828
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
39389–39415
Language:
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1828/
DOI:
Bibkey:
Cite (ACL):
Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie Hu, Yu Qin, Erchao.zec, Xiaoxi Jiang, and Guanjunjiang. 2026. MARCH: Multi-Agent Reinforced Check for Hallucination. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 39389–39415, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
MARCH: Multi-Agent Reinforced Check for Hallucination (Li et al., ACL 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1828.pdf
Checklist:
 2026.acl-long.1828.checklist.pdf