Jiawei Shen
2026
KCVR: Knowledge-Centric Video Reconstruction for Structured Pedagogical Summarization via Dynamic Graph Planning
Jingjiang Liu | Jia Zhu | Hanghui Guo | Weijie Shi | Yue Cui | Xiaokang Jin | Yilin Wang | Qingyu Niu | Jiawei Shen | Guoqing Ma | Yidan Liang | Shimin Di | Jiajie Xu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Existing video summarization methods mainly compress content for gist browsing, but they often break the prerequisite logic in instructional videos and induce logical inversions (e.g., conclusions before premises). We formalize this problem as Structure-Pedagogical Reconstruction (SPR). SPR raises two challenges: (1) Structure Hallucination, where retrieved knowledge is topologically valid but not evidence-grounded by the blackboard; and (2) Logical Inversion, where soft prompt-level graph injection fails to enforce prerequisite order during decoding. To address these challenges, we propose Knowledge-Centric Video Reconstruction (KCVR), a Plan-then-Generate neuro-symbolic framework that decouples epistemic planning from content generation. KCVR prunes a Dual-Layer Epistemic Graph into a minimal video-supported plan, then realizes the plan with visually anchored attention and topology-constrained decoding. We additionally release EduStruct, a 10-discipline benchmark for SPR and structure-centric evaluation. Experiments show that KCVR outperforms strong end-to-end baselines on Knowledge Progression Consistency and Learning Objective Coverage. Our code and data are available at https://github.com/mark1001-ljj/video_sum.
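To make the notion of a logical inversion concrete, here is a minimal sketch that checks whether a summary presents any concept before one of its prerequisites. It is an illustration only, not KCVR's planner or its topology-constrained decoder: the function name `find_inversions`, the dictionary-based prerequisite graph, and the example concepts are all assumptions introduced here.

```python
from typing import Dict, List, Set, Tuple


def find_inversions(
    order: List[str], prerequisites: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Return (prerequisite, concept) pairs where the concept is presented
    before its prerequisite in `order`.

    `prerequisites` maps each concept to the set of concepts it depends on
    (a hypothetical representation, not the paper's Dual-Layer Epistemic Graph).
    """
    # Position of each concept in the presented order.
    position = {concept: i for i, concept in enumerate(order)}
    inversions = []
    for concept, prereqs in prerequisites.items():
        if concept not in position:
            continue  # concept is not mentioned in the summary
        for prereq in sorted(prereqs):
            # An inversion occurs when a prerequisite appears later than
            # the concept that depends on it.
            if prereq in position and position[prereq] > position[concept]:
                inversions.append((prereq, concept))
    return inversions


# Hypothetical prerequisite graph for a short calculus lesson.
graph = {"derivative": {"limit"}, "chain_rule": {"derivative"}}

consistent = ["limit", "derivative", "chain_rule"]
inverted = ["chain_rule", "limit", "derivative"]

print(find_inversions(consistent, graph))
print(find_inversions(inverted, graph))
```

An empty result means the presented order respects every prerequisite edge that it covers. A non-empty result lists the edges that the summary violates.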
ACR: Adaptive Context Refactoring via Context Refactoring Operators for Multi-Turn Dialogue
Jiawei Shen | Jia Zhu | Hanghui Guo | Weijie Shi | Yue Cui | Qingyu Niu | Guoqing Ma | Jingjiang Liu | Yidan Liang | Yilin Wang | Shimin Di | Jiajie Xu
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Models (LLMs) have shown remarkable performance in multi-turn dialogue. However, models still struggle to stay aligned with what has been established earlier, follow dependencies across many turns, and avoid drifting into incorrect facts as the interaction grows longer. Existing approaches primarily focus on extending the context window, introducing external memory, or applying context compression, yet these methods still face limitations such as contextual inertia and state drift. To address these challenges, we propose the Adaptive Context Refactoring (ACR) framework, which dynamically monitors and reshapes the interaction history to actively mitigate contextual inertia and state drift. ACR is built on a library of context refactoring operators and a teacher-guided self-evolving training paradigm that learns when to intervene and how to refactor, thereby decoupling context management from the reasoning process. Extensive experiments on multi-turn dialogue demonstrate that our method significantly outperforms existing baselines while reducing token consumption. Our code is available at https://github.com/ClannadKno/multi-turn.
RSDA: Restoring Stale Data Affinity via Dynamic Renovation Strategy for Mitigating Data Scarcity
Yidan Liang | Jia Zhu | Weijie Shi | Hanghui Guo | Yue Cui | Jiawei Shen | Guoqing Ma | Jingjiang Liu | Qingyu Niu | Yilin Wang | Shimin Di | Jiajie Xu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
High-quality data is the cornerstone of advancing large language models. However, the field currently faces a critical dilemma: the supply of premium data is nearing depletion, while vast stale corpora remain underutilized. Our empirical analysis reveals that training models directly on such data often leads to performance degradation. We attribute this phenomenon to the data affinity gap, a misalignment stemming from the model’s inability to effectively comprehend the data or from inherent quality defects in the data. To bridge this gap, we propose the Restoring Stale Data Affinity (RSDA) framework. First, using our proposed potential entropy metric, RSDA quantifies the latent value of samples to identify stale data with higher renovation potential. Subsequently, the framework employs a dynamic renovation strategy selection mechanism to determine the optimal component-level strategy for each instance, transforming low-affinity stale samples into high-quality training data. Comprehensive experimental results demonstrate that RSDA effectively enhances data affinity, achieving performance improvements using less than 10% of the data volume, thereby underscoring that the latent potential of stale corpora remains largely untapped. The code is available at https://github.com/wenfiii/RSDA.
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
Zhongyuan Peng | Yifan Yao | Kaijing Ma | Shuyue Guo | Yizhe Li | Yichi Zhang | Chenchen Zhang | Yifan Zhang | Zhouliang Yu | Luming Li | Minghao Liu | Yihang Xia | Jiawei Shen | Yuchen Wu | Yixin Cao | Zhaoxiang Zhang | Wenhao Huang | Jiaheng Liu | Ge Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Translating natural language mathematical statements into formal, executable code is a fundamental challenge in automated theorem proving. While prior work has focused on generation and compilation success, little attention has been paid to the critic phase, that is, the evaluation of whether generated formalizations truly capture the semantic intent of the original problem. In this paper, we introduce CriticLean, a novel critic-guided reinforcement learning framework that elevates the role of the critic from a passive validator to an active learning component. First, we propose CriticLeanGPT, trained via supervised fine-tuning and reinforcement learning, to rigorously assess the semantic fidelity of Lean 4 formalizations. Then, we introduce CriticLeanBench, a benchmark designed to measure models’ ability to distinguish semantically correct from incorrect formalizations, and demonstrate that our trained CriticLeanGPT models can significantly outperform strong open- and closed-source baselines. Building on the CriticLean framework, we construct FineLeanCorpus, a dataset comprising over 509K problems that exhibits rich domain diversity, broad difficulty coverage, and high correctness based on human evaluation. Overall, our findings highlight that optimizing the critic phase is essential for producing reliable formalizations, and we hope CriticLean will provide valuable insights for future advances in formal mathematical reasoning.
Co-authors
- Yue Cui 3
- Shimin Di 3
- Hanghui Guo 3
- Yidan Liang 3
- Jingjiang Liu 3
- Guoqing Ma 3
- Qingyu Niu 3
- Weijie Shi 3
- Yilin Wang 3
- Jiajie Xu 3
- Jia Zhu 3
- Yixin Cao 1
- Shuyue Guo 1
- Wenhao Huang 1
- Xiaokang Jin 1
- Yizhe Li 1
- Luming Li 1
- Minghao Liu 1
- Jiaheng Liu 1
- Kaijing Ma 1
- Zhongyuan Peng 1
- Yuchen Wu 1
- Yihang Xia 1
- Yifan Yao 1
- Zhouliang Yu 1
- Yichi Zhang 1
- Chenchen Zhang 1
- Yifan Zhang 1
- Zhaoxiang Zhang 1
- Ge Zhang 1