Guangyu Wang
2026
PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations
Yuhe Wu | Guangyu Wang | Yuran Chen | Jiatong Zhang | Yutong Zhang | Yujie Chen | Jiaming Shang | Guang Zhang | Zhuang Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
As large language models (LLMs) evolve from conversational assistants into agents capable of handling complex tasks, they are increasingly deployed in high-risk domains. However, existing benchmarks largely rely on mixed queries and posterior evaluation (output-level scoring), which quantifies hallucination severity but offers limited insight into where and why hallucinations arise in the generation pipeline. We therefore reformulate hallucination evaluation as a diagnostic problem and propose PRISM, a controlled benchmark that disentangles hallucinations into four dimensions: knowledge missing, knowledge errors, reasoning errors, and instruction-following errors, grounded in three stages of generation (memory, instruction, and reasoning). PRISM contains 9,448 instances across 65 tasks and supports fine-grained, stage-aware diagnostic evaluation. Evaluating 24 mainstream open-source and proprietary LLMs, we uncover consistent trade-offs across instruction following, memory retrieval, and logical reasoning, showing that mitigation strategies often improve specific dimensions at the expense of others. We hope PRISM provides a framework for understanding the specific mechanisms behind LLM hallucinations, ultimately accelerating the development of trustworthy large language models.
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
Junbo Niu | Zheng Liu | Zhuangcheng Gu | Bin Wang | Linke Ouyang | Zhiyuan Zhao | Tao Chu | Tianyao He | Fan Wu | Qintong Zhang | Zhenjiang Jin | Guang Liang | Rui Zhang | Wenzheng Zhang | Yuan Qu | Zhifei Ren | Yuefeng Sun | Zirui Tang | Boyu Niu | Yuanhong Zheng | Dongsheng Ma | Ziyang Miao | Hejun Dong | Siyi Qian | Junyuan Zhang | Fangdong Wang | Jingzhou Chen | Xiaomeng Zhao | Liqun Wei | Wei Li | Shasha Wang | RuiLiang Xu | Yuanyuan Cao | Lu Chen | Qianqian Wu | Huaiyu Gu | Lindong Lu | Dechen Lin | Shenguanlin | Xuanhe Zhou | Linfeng Zhang | Yuhang Zang | Xiaoyi Dong | Jiaqi Wang | Bo Zhang | Lei Bai | Pei Chu | Weijia Li | Jiang Wu | Lijun Wu | Zhenxiang Li | Guangyu Wang | Zhongying Tu | Chao Xu | Kai Chen | Bowen Zhou | Dahua Lin | Wentao Zhang | Conghui He
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
We introduce MinerU2.5, a 1.2B-parameter document parsing vision-language model that achieves state-of-the-art recognition accuracy while maintaining exceptional computational efficiency. Our approach employs a coarse-to-fine, two-stage parsing strategy that decouples global layout analysis from local content recognition. In the first stage, the model performs efficient layout analysis on downsampled images to identify structural elements, circumventing the computational overhead of processing high-resolution inputs. In the second stage, guided by the global layout, it performs targeted content recognition on native-resolution crops extracted from the original image, preserving fine-grained details in dense text, complex formulas, and tables. To support this strategy, we developed a comprehensive data engine that generates diverse, large-scale training corpora for both pretraining and fine-tuning. Ultimately, MinerU2.5 demonstrates strong document parsing ability, achieving state-of-the-art performance on multiple benchmarks, surpassing both general-purpose and domain-specific models across various recognition tasks, while maintaining significantly lower computational overhead.
Co-authors
- Lei Bai 1
- Yuanyuan Cao 1
- Jingzhou Chen 1
- Kai Chen 1
- Lu Chen 1
- Yujie Chen 1
- Yuran Chen 1
- Pei Chu 1
- Tao Chu 1
- Hejun Dong 1
- Xiaoyi Dong 1
- Huaiyu Gu 1
- Zhuangcheng Gu 1
- Conghui He 1
- Tianyao He 1
- Zhenjiang Jin 1
- Wei Li 1
- Weijia Li 1
- Zhenxiang Li 1
- Guang Liang 1
- Dahua Lin 1
- Dechen Lin 1
- Zheng Liu 1
- Zhuang Liu 1
- Lindong Lu 1
- Dongsheng Ma 1
- Ziyang Miao 1
- Boyu Niu 1
- Junbo Niu 1
- Linke Ouyang 1
- Siyi Qian 1
- Yuan Qu 1
- Zhifei Ren 1
- Jiaming Shang 1
- Shenguanlin 1
- Yuefeng Sun 1
- Zirui Tang 1
- Zhongying Tu 1
- Bin Wang 1
- Fangdong Wang 1
- Jiaqi Wang 1
- Shasha Wang 1
- Liqun Wei 1
- Fan Wu 1
- Jiang Wu 1
- Lijun Wu 1
- Qianqian Wu 1
- Yuhe Wu 1
- Chao Xu 1
- RuiLiang Xu 1
- Yuhang Zang 1
- Bo Zhang 1
- Guang Zhang 1
- Jiatong Zhang 1
- Junyuan Zhang 1
- Linfeng Zhang 1
- Qintong Zhang 1
- Rui Zhang 1
- Wentao Zhang 1
- Wenzheng Zhang 1
- Yutong Zhang 1
- Xiaomeng Zhao 1
- Zhiyuan Zhao 1
- Yuanhong Zheng 1
- Bowen Zhou 1
- Xuanhe Zhou 1
Venues
- ACL 2