Jiajun Song
2026
MARCH: Multi-Agent Reinforced Check for Hallucination
Zhuo Li | Yupeng Zhang | Pengyu Cheng | Jiajun Song | Mengyu Zhou | Hao Li | Shujie Hu | Yu Qin | Erchao.zec | Xiaoxi Jiang | Guanjunjiang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Hallucination remains a critical bottleneck for large language models (LLMs), undermining their reliability in real-world applications, especially in Retrieval-Augmented Generation (RAG) systems. While existing hallucination detection methods employ LLM-as-a-judge to verify LLM outputs against retrieved evidence, they suffer from inherent *confirmation bias*, where the verifier inadvertently reproduces the errors of the original generation. To address this, we introduce **M**ulti-**A**gent **R**einforced self-**C**heck for **H**allucination (MARCH), a framework that enforces rigorous factual alignment by leveraging deliberate *information asymmetry*. MARCH orchestrates a collaborative pipeline of three specialized agents: a Solver, a Proposer, and a Checker. The Solver generates an initial RAG response, which the Proposer decomposes into claim-level verifiable atomic propositions. Crucially, the Checker validates these propositions against retrieved evidence in isolation, deprived of the Solver’s original output. This well-crafted information asymmetry scheme breaks the cycle of self-confirmation bias. By training this pipeline with multi-agent reinforcement learning (MARL), we enable the agents to co-evolve and optimize factual adherence. Extensive experiments across hallucination benchmarks demonstrate that MARCH substantially reduces hallucination rates. Notably, an 8B-parameter LLM equipped with MARCH achieves performance competitive with powerful closed-source models. MARCH paves a scalable path for factual self-improvement of LLMs through co-evolution. The code is at https://github.com/Qwen-Applications/MARCH.
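The Solver, Proposer, and Checker flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `CheckResult` type, and the toy string-matching agents are all hypothetical stand-ins for LLM calls, and the MARL training step is not shown. The only point the sketch makes is the information asymmetry: the Checker receives a proposition and the evidence, never the Solver's full answer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    proposition: str
    supported: bool


def run_pipeline(
    question: str,
    evidence: List[str],
    solver: Callable[[str, List[str]], str],
    proposer: Callable[[str], List[str]],
    checker: Callable[[str, List[str]], bool],
) -> Tuple[str, List[CheckResult]]:
    """Hypothetical three-agent flow: solve, decompose, then check in isolation."""
    answer = solver(question, evidence)
    propositions = proposer(answer)
    # Information asymmetry: the checker is given only one proposition and
    # the retrieved evidence. `answer` is deliberately not passed to it.
    results = [CheckResult(p, checker(p, evidence)) for p in propositions]
    return answer, results


# Toy stand-ins for the three agents (assumptions for illustration only).
def toy_solver(question: str, evidence: List[str]) -> str:
    # Returns one claim the evidence supports and one it does not.
    return "Paris is the capital of France. Paris has 90 million residents."


def toy_proposer(answer: str) -> List[str]:
    # Naive decomposition: one proposition per sentence.
    return [s.strip() for s in answer.split(".") if s.strip()]


def toy_checker(proposition: str, evidence: List[str]) -> bool:
    # Naive verification: case-insensitive substring match against evidence.
    return any(proposition.lower() in doc.lower() for doc in evidence)


if __name__ == "__main__":
    docs = ["Paris is the capital of France and its largest city."]
    _, checks = run_pipeline(
        "What is the capital of France?", docs, toy_solver, toy_proposer, toy_checker
    )
    for c in checks:
        print(c.proposition, "->", "supported" if c.supported else "unsupported")
```

In a real system each callable would wrap a model call, and the per-proposition verdicts could serve as a reward signal; how MARCH actually defines that reward is not specified in the abstract.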
2025
Beyond A Single AI Cluster: A Survey of Decentralized LLM Training
Haotian Dong | Jingyan Jiang | Rongwei Lu | Jiajun Luo | Jiajun Song | Bowen Li | Ying Shen | Zhi Wang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The emergence of large language models (LLMs) has revolutionized AI development, yet their resource demands extend beyond a single cluster or even datacenter, limiting accessibility to well-resourced organizations. Decentralized training has emerged as a promising paradigm to leverage dispersed resources across clusters, datacenters and even regions, offering the potential to democratize LLM development for broader communities. As the first comprehensive exploration of this emerging field, we present decentralized LLM training as a resource-driven paradigm and categorize existing efforts into community-driven and organizational approaches. We further clarify this through: (1) a comparison with related paradigms, (2) characterization of decentralized resources, and (3) a taxonomy of recent advancements. We also provide up-to-date case studies and outline future directions to advance research in decentralized LLM training.