Peijie Wang
2026
Geoparsing: Diagram Parsing for Plane and Solid Geometry with a Unified Formal Language
Peijie Wang | Ming-Liang Zhang | Jun Cao | Chao Deng | Dekang Ran | Pi Bu | Hongda Sun | Xuan Zhang | Yingyao Wang | Jun Song | Bo Zheng | Fei Yin | Cheng-Lin Liu
Findings of the Association for Computational Linguistics: ACL 2026
Multimodal Large Language Models (MLLMs) have achieved remarkable progress but continue to struggle with geometric reasoning, primarily because of a perception bottleneck in recognizing fine-grained visual elements. While formal languages have aided plane geometry understanding, solid geometry, which requires spatial understanding, remains largely unexplored. In this paper, we address this challenge by designing a unified formal language that integrates plane and solid geometry, comprehensively covering geometric structures and semantic relations. We construct GDP-29K, a large-scale dataset comprising 20k plane and 9k solid geometry samples collected from diverse real-world sources, each paired with its ground-truth formal description. We propose a training paradigm combining Supervised Fine-Tuning with Reinforcement Learning via Verifiable Rewards, which effectively enforces syntactic correctness and geometric consistency. Experiments show that our approach achieves state-of-the-art parsing performance. Furthermore, we demonstrate that our parsed formal descriptions serve as a critical cognitive scaffold, significantly boosting MLLMs’ capabilities on downstream geometry reasoning tasks.
Too Long, Do Re-weighting for Efficient LLM Reasoning Compression
Zhong-Zhi Li | Xiao Liang | Zihao Tang | Lei Ji | Peijie Wang | Haotian Xu | Xing W | Haizhen Huang | Weiwei Deng | Yeyun Gong | Zhijiang Guo | Xiao Liu | Fei Yin | Cheng-Lin Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) have recently achieved remarkable progress on complex reasoning tasks by leveraging extended Chain-of-Thought (CoT) techniques. These reasoning processes can be roughly categorized into System-1 (fast and intuitive) and System-2 (slow and deliberate) paradigms. However, excessive reliance on lengthy System-2-style reasoning during inference can produce extremely long outputs, thereby reducing efficiency. In this work, we propose Thinking Length Data Re-weighting (TLDR), a method that does not rely on sophisticated data annotations or interpolation between multiple models. We continuously balance the weights between the model’s System-1 and System-2 data to eliminate redundant reasoning processes while preserving the model’s reasoning capability. We validate our method across multiple base models, including Deepseek-R1-Distilled Qwen models, as well as on diverse benchmarks with varying difficulty levels. Our method reduces the number of output tokens by nearly 40% while maintaining reasoning accuracy.
2025
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating
Chao Deng | Jiale Yuan | Pi Bu | Peijie Wang | Zhong-Zhi Li | Jian Xu | Xiao-Hui Li | Yuan Gao | Jun Song | Bo Zheng | Cheng-Lin Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large vision language models (LVLMs) have remarkably improved document understanding capabilities, enabling the handling of complex document elements, longer contexts, and a wider range of tasks. However, existing document understanding benchmarks have been limited to handling only a small number of pages and fail to provide a comprehensive analysis of layout element locating. In this paper, we first define three primary task categories: Long Document Understanding, numerical Reasoning, and cross-element Locating. We then propose a comprehensive benchmark, LongDocURL, integrating the above three primary tasks and comprising 20 sub-tasks categorized based on different primary tasks and answer evidence. Furthermore, we develop a semi-automated construction pipeline and collect 2,325 high-quality question-answering pairs, covering more than 33,000 pages of documents, substantially exceeding existing benchmarks in scale. Subsequently, we conduct comprehensive evaluation experiments on both open-source and closed-source models across 26 different configurations, revealing critical performance gaps in this field. The code and data are available at https://github.com/dengc2023/LongDocURL.