Xiawu Zheng
2026
Relaxing the Constraints: A Dual-Importance Projection Mechanism for Lifelong Model Editing
Zhenghai Chen | Senbin Xu | Jiaxi Tan | Xinhua Wu | Yan Zhang | Xiawu Zheng | Shengchuan Zhang | Ke Li | Sicheng Zhao | Liujuan Cao | Rongrong Ji
Findings of the Association for Computational Linguistics: ACL 2026
Factual knowledge stored in Large Language Models (LLMs) inevitably becomes outdated or erroneous over time, making it critical to update these models without incurring the high cost of retraining. Existing sequential knowledge editing methods predominantly rely on strict orthogonal projection to preserve previously edited knowledge. However, this excessive constraint limits gradient expressiveness, resulting in a significant degradation of model generalization and overall performance as the number of edits increases. To address this challenge, we propose Dual-Importance Projection Editing (DipEdit). This method leverages Singular Value Decomposition (SVD) to identify critical gradient subspaces and introduces a dual mechanism comprising "accumulated importance" and "projection importance." Unlike traditional approaches that enforce strict orthogonality, DipEdit dynamically scales gradient components parallel to key subspaces based on their projection importance rather than discarding them directly. This approach enhances the model’s adaptability to new knowledge while maximally preserving historical knowledge. Extensive experiments conducted on five mainstream LLMs using the ZsRE and Counterfact datasets demonstrate that DipEdit effectively handles thousands of sequential edits. The proposed method achieves an average comprehensive performance improvement of 10.36% and effectively maintains the model’s general capabilities on downstream tasks. Code is available at: https://github.com/czhhhla/DipEdit.
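The contrast the abstract draws between strict orthogonal projection and scaling the parallel gradient components can be illustrated with a small sketch. This is not the DipEdit formulation: the function name `relaxed_projection`, the hand-picked orthonormal basis (standing in for directions that would come from an SVD of past gradients), and the per-direction importance weights are all assumptions made for illustration.

```python
# Hypothetical sketch: strict orthogonal projection vs. a relaxed, scaled one.
# The basis, the importance values, and the scaling rule are illustrative
# assumptions, not the rule used by DipEdit.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relaxed_projection(grad, basis, importance):
    """Shrink the component of grad along each protected direction.

    basis: orthonormal vectors spanning the protected subspace (assumed given,
           e.g. top singular vectors of previously applied gradients).
    importance: one weight in [0, 1] per basis vector. 1.0 removes the
           component entirely (strict orthogonal projection); 0.0 keeps it.
    """
    out = list(grad)
    for u, w in zip(basis, importance):
        coeff = dot(grad, u)  # size of the component parallel to u
        out = [o - w * coeff * ui for o, ui in zip(out, u)]
    return out

grad = [3.0, 4.0, 0.0]
basis = [[1.0, 0.0, 0.0]]  # one protected direction

strict = relaxed_projection(grad, basis, [1.0])   # parallel part discarded
relaxed = relaxed_projection(grad, basis, [0.5])  # parallel part halved
print(strict, relaxed)
```

With weight 1.0 the update loses everything along the protected direction; with a smaller weight part of that component survives, which is the kind of relaxation the abstract describes.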
ALGOGEN: Tool-Generated Verifiable Traces for Reliable Algorithm Visualization
Liaokunpeng | Yuexiao Ma | Yisheng Lin | Hualin Zeng | Xiawu Zheng | Rongrong Ji
Findings of the Association for Computational Linguistics: ACL 2026
Algorithm Visualization (AV) helps students build mental models by animating algorithm execution states. Recent LLM-based systems such as CODE2VIDEO generate AV videos in an end-to-end manner. However, this paradigm requires the system to simultaneously simulate algorithm flow and satisfy video rendering constraints (element layout, color schemes, etc.), a complex task that induces LLM hallucinations. This results in reduced execution success rates, element overlap, and inter-frame inconsistencies. To address these challenges, we propose ALGOGEN, a novel paradigm that decouples algorithm execution from rendering. We first introduce Visualization Trace Algebra (VTA), a monoid over algorithm visual states and operations. The LLM then generates a Python tracker that simulates algorithm flow and outputs VTA-JSON traces, a JSON encoding of VTA. For rendering, we define a Rendering Style Language (RSL) to templatize algorithm layouts. A deterministic renderer then compiles algorithm traces with RSL into Manim, LaTeX/TikZ, or Three.js outputs (Manim, TikZ, and Three.js are, respectively, a Python animation engine, a LaTeX vector graphics package, and a JavaScript 3D rendering library). Evaluated on a LeetCode AV benchmark of 200 tasks, ALGOGEN improves the average success rate by 17.3 percentage points over end-to-end methods (99.8% vs. 82.5%). These results demonstrate that our decoupling paradigm effectively mitigates LLM hallucinations in complex AV tasks, providing a more reliable solution for automated generation of high-quality algorithm visualizations. Demo videos and code are available at: .
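The idea of a tracker that emits a trace, with traces forming a monoid, can be sketched as follows. The abstract does not give the VTA-JSON schema, so the operation format (`{"op": "swap", "i": ..., "j": ...}`) and the names `compose`, `apply_trace`, and `bubble_sort_trace` are hypothetical. Traces are modelled here as lists of operations, with concatenation as the monoid operation and the empty list as the identity.

```python
import json

# Hypothetical trace format; the real VTA-JSON schema is not specified
# in the abstract.
EMPTY = []  # identity element of the trace monoid

def compose(t1, t2):
    """Monoid operation: run t1, then t2 (list concatenation is associative)."""
    return t1 + t2

def apply_trace(state, trace):
    """Replay a trace on a copy of the visual state (here, a list of values)."""
    state = list(state)
    for op in trace:
        if op["op"] == "swap":
            i, j = op["i"], op["j"]
            state[i], state[j] = state[j], state[i]
        else:
            raise ValueError("unknown op: " + op["op"])
    return state

def bubble_sort_trace(values):
    """Tracker: simulate bubble sort and record each swap as a trace op."""
    vals = list(values)
    trace = []
    for end in range(len(vals) - 1, 0, -1):
        for i in range(end):
            if vals[i] > vals[i + 1]:
                vals[i], vals[i + 1] = vals[i + 1], vals[i]
                trace.append({"op": "swap", "i": i, "j": i + 1})
    return trace

data = [3, 1, 2]
trace = bubble_sort_trace(data)
replayed = apply_trace(data, trace)  # replaying the trace sorts the data
encoded = json.dumps(trace)          # JSON form a renderer could consume
print(replayed, encoded)
```

Because the trace is plain data, a separate deterministic renderer can consume it without re-running the algorithm, which is the decoupling the abstract argues for.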
2025
Automated Fine-Grained Mixture-of-Experts Quantization
Zhanhao Xie | Yuexiao Ma | Xiawu Zheng | Fei Chao | Wanchen Sui | Yong Li | Shen Li | Rongrong Ji
Findings of the Association for Computational Linguistics: ACL 2025
The Mixture of Experts (MoE) architecture enables efficient model scaling through conditional computation, where only a subset of parameters is activated per input. However, this distributed architecture poses unprecedented challenges for model compression, as conventional quantization methods optimized for dense networks prove inadequate. This paper introduces a specialized quantization framework for MoE architectures, motivated by our discovery that weight matrices across expert networks exhibit distinctive channel-wise outlier distributions, necessitating a more nuanced compression approach. Through theoretical analysis incorporating Fisher Information matrices and condition number characteristics, we establish a fundamental relationship between layer functionality and quantization sensitivity, demonstrating that down-projection layers inherently demand higher precision compared to up-projection layers. Leveraging these insights, we develop an automated channel-wise quantization framework that dynamically determines optimal bit-width allocations while maintaining minimal computational overhead through efficient statistical approximations. When evaluated on the Mixtral-8x7b-v0.1 architecture, our methodology demonstrates a 3.96% improvement over existing state-of-the-art approaches across natural language understanding benchmarks, while achieving superior compression ratios.
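A rough picture of per-channel bit-width allocation driven by outliers is sketched below. The selection rule (compare the largest magnitude in a channel to the mean magnitude), the threshold, the candidate bit-widths, and the names `pick_bits` and `quantize_channel` are assumptions for illustration; the paper's actual criterion draws on Fisher information and condition numbers, which this sketch does not model.

```python
# Hypothetical sketch of channel-wise mixed-precision quantization.
# The outlier rule and thresholds below are illustrative assumptions only.

def pick_bits(channel, low=2, high=4, ratio_threshold=4.0):
    """Give more bits to channels whose peak magnitude dwarfs the mean."""
    mags = [abs(w) for w in channel]
    mean = sum(mags) / len(mags)
    if mean == 0:
        return low
    return high if max(mags) / mean > ratio_threshold else low

def quantize_channel(channel, bits):
    """Symmetric uniform quantization; returns dequantized values and scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in channel)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in channel]
    return [v * scale for v in q], scale

smooth = [0.1, -0.2, 0.15, 0.05]                           # no outlier
spiky = [0.01, 0.02, -0.01, 0.02, 0.01, 0.02, 0.01, 2.0]   # one large outlier

for name, ch in [("smooth", smooth), ("spiky", spiky)]:
    bits = pick_bits(ch)
    deq, scale = quantize_channel(ch, bits)
    err = max(abs(a - b) for a, b in zip(ch, deq))
    print(name, bits, round(err, 4))
```

The per-element reconstruction error of this quantizer is bounded by half the channel's scale, so a channel dominated by one outlier has a coarse grid for its small weights; assigning it more bits narrows that grid.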
Data Interpreter: An LLM Agent for Data Science
Sirui Hong | Yizhang Lin | Bang Liu | Bangbang Liu | Binhao Wu | Ceyao Zhang | Danyang Li | Jiaqi Chen | Jiayi Zhang | Jinlin Wang | Li Zhang | Lingyao Zhang | Min Yang | Mingchen Zhuge | Taicheng Guo | Tuo Zhou | Wei Tao | Robert Tang | Xiangtao Lu | Xiawu Zheng | Xinbing Liang | Yaying Fei | Yuheng Cheng | Yongxin Ni | Zhibin Gou | Zongze Xu | Yuyu Luo | Chenglin Wu
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Model (LLM)-based agents have excelled in various domains but face significant challenges when applied to data science workflows due to their complex, multi-stage nature. Current LLM-based agents struggle with non-linear relationships, recursive dependencies, implicit data- and logic-dependent reasoning, and managing extensive context. In this paper, we introduce Data Interpreter, an LLM-based agent that addresses these challenges through hierarchical graph-based modeling to represent the complexity and a progressive strategy for step-by-step verification, refinement, and consistent context management. Extensive experiments confirm the effectiveness of Data Interpreter. On InfiAgent-DABench, it boosts performance by 25% (from 75.9% to 94.9%), and on machine learning and open-ended tasks, it lifts accuracy from 88% to 95% and from 60% to 97%, respectively. Moreover, our method surpasses state-of-the-art baselines by 26% on the MATH dataset. We will release the code upon publication.
Learning Transition Patterns by Large Language Models for Sequential Recommendation
Jianyang Zhai | Zi-Feng Mai | Dongyi Zheng | Chang-Dong Wang | Xiawu Zheng | Hui Li | Feidiao Yang | Yonghong Tian
Proceedings of the 31st International Conference on Computational Linguistics
Large Language Models (LLMs) have demonstrated powerful performance in sequential recommendation due to their robust language modeling and comprehension capabilities. In such paradigms, the item texts of interaction sequences are formulated as sentences and LLMs are utilized to learn language representations or directly generate target item texts by incorporating instructions. Despite their promise, these methods solely focus on modeling the mapping from sequential texts to target items, neglecting the relationship between the items in an interaction sequence. This results in a failure to learn the transition patterns between items, which reflect the dynamic change in user preferences and are crucial for predicting the next item. To tackle this issue, we propose a novel framework for mapping the sequential item texts to the sequential item IDs, named ST2SI. Specifically, we first introduce multi-query input and item linear projection (ILP) to model the conditional probability distribution of items. Then, we further propose ID alignment to address misalignment between item texts and item IDs by instruction tuning. Finally, we propose efficient ILP tuning to adapt flexibly to different scenarios, requiring only training a linear layer to achieve competitive performance. Extensive experiments on six real-world datasets show our approach outperforms the best baselines by 7.33% in NDCG@10, 4.65% in Recall@10, and 8.42% in MRR.
Co-authors
- Rongrong Ji 3
- Yuexiao Ma 2
- Liujuan Cao 1
- Fei Chao 1
- Jiaqi Chen 1
- Zhenghai Chen 1
- Yuheng Cheng 1
- Yaying Fei 1
- Zhibin Gou 1
- Taicheng Guo 1
- Sirui Hong 1
- Yong Li 1
- Shen Li 1
- Danyang Li 1
- Ke Li 1
- Hui Li 1
- Xinbing Liang 1
- Liaokunpeng 1
- Yizhang Lin 1
- Yisheng Lin 1
- Bang Liu 1
- Bangbang Liu 1
- Xiangtao Lu 1
- Yuyu Luo 1
- Zi-Feng Mai 1
- Yongxin Ni 1
- Wanchen Sui 1
- Jiaxi Tan 1
- Robert Tang 1
- Wei Tao 1
- Yonghong Tian 1
- Jinlin Wang 1
- Chang-Dong Wang 1
- Binhao Wu 1
- Chenglin Wu 1
- Xinhua Wu 1
- Zhanhao Xie 1
- Zongze Xu 1
- Senbin Xu 1
- Min Yang 1
- Feidiao Yang 1
- Hualin Zeng 1
- Jianyang Zhai 1
- Ceyao Zhang 1
- Jiayi Zhang 1
- Li Zhang 1
- Lingyao Zhang 1
- Yan Zhang 1
- Shengchuan Zhang 1
- Sicheng Zhao 1
- Dongyi Zheng 1
- Tuo Zhou 1
- Mingchen Zhuge 1