Qi Zhang
Other people with similar names: Qi Zhang, Qi Zhang, Qi Zhang, Qi Zhang, Qi Zhang, Qi Zhang
Unverified author pages with similar names: Qi Zhang
2026
RLSeek: Evidence-Grounded Reasoning for RAG Hallucination Detection
Zhaoheng Huang | Dacheng Wen | Yutao Zhu | Xiaoying Lian | Yushi Liang | Kai Hao | Nan Li | Liangjie Zhang | Qi Zhang | Ji-Rong Wen | Zhicheng Dou | Fangzhao Wu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) in retrieval-augmented generation systems can still produce hallucinations, generating content that is unsupported or contradicted by the source texts and undermines reliability. Recent work has addressed this problem by training span-level hallucination detectors using reinforcement learning (RL) and chain-of-thought (CoT) reasoning. In this work, we show through error analysis that incorrect predictions by existing reasoning-based detectors are strongly associated with CoT processes that lack explicit grounding in source evidence, particularly when verification steps do not quote or verify claims against the retrieved documents. This behaviour contrasts with human verification practices in benchmarks such as RAGTruth, where evidence quotation is a prerequisite for determining hallucinated spans. Motivated by this observation, we propose an evidence-grounded RL framework, namely RLSeek, to explicitly enforce active evidence seeking during CoT reasoning by requiring quotation of relevant source segments at each verification step. Experiments on the RAGTruth and NewsSum datasets demonstrate consistent improvements in hallucination span detection performance, with limited additional reasoning overhead and improved robustness in out-of-domain settings.
Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory
Zihao Tang | Xin Yu | Ziyu Xiao | Zengxuan Wen | Zelin Li | Jiaxi Zhou | Hualei Wang | Haohua Wang | Haizhen Huang | Weiwei Deng | Feng Sun | Qi Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
AI Memory, specifically how models organize and retrieve historical messages, is becoming increasingly valuable to Large Language Models (LLMs), yet existing methods (RAG and Graph-RAG) primarily retrieve memory through similarity-based mechanisms. While efficient, such System-1-style retrieval struggles with scenarios that require global reasoning or comprehensive coverage of all relevant information. In this work, we propose Mnemis, a novel memory framework that integrates System-1 similarity search with a complementary System-2 mechanism, termed Global Selection. Mnemis organizes memory into a base graph for similarity retrieval and a hierarchical graph that enables top-down, deliberate traversal over semantic hierarchies. By combining the complementary strengths of both retrieval routes, Mnemis retrieves memory items that are both semantically and structurally relevant. Mnemis achieves state-of-the-art performance among all compared methods on long-term memory benchmarks, scoring 93.9 on LoCoMo and 91.6 on LongMemEval-S using GPT-4.1-mini.