Li Ding
2026
QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization
Changxin Ke | Rui Zhang | Jiaming Guo | Yuanbo Wen | Li Ding | Shuo Wang | Xuyuan Zhu | Xiong Peng | Di Huang | Zidong Du | Xing Hu | Qi Guo | Yunji Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) achieve strong program repair performance but often suffer from over-editing, where excessive modifications overwrite correct code and hinder bug localization. We systematically quantify its impact and introduce the precise repair task, which maximizes reuse of correct code while fixing only the buggy parts. Building on this insight, we propose PRepair, a framework that mitigates over-editing and improves repair accuracy. PRepair has two components: Self-Breaking, which generates diverse buggy programs via controlled bug injection and min–max sampling, and Self-Repairing, which trains models with Edit-Aware Group Relative Policy Optimization (EA-GRPO) using an edit-aware reward to encourage minimal yet correct edits. Experiments show that PRepair improves repair precision by up to 31.4% under fix1@1, a metric that jointly considers repair correctness and extent, and significantly increases decoding throughput when combined with speculative editing, demonstrating its potential for precise and practical code repair.
2020
Tencent AI Lab Machine Translation Systems for WMT20 Chat Translation Task
Longyue Wang | Zhaopeng Tu | Xing Wang | Li Ding | Liang Ding | Shuming Shi
Proceedings of the Fifth Conference on Machine Translation
This paper describes Tencent AI Lab’s submission to the WMT 2020 shared task on chat translation in English-German. Our neural machine translation (NMT) systems are built on sentence-level, document-level, non-autoregressive (NAT) and pretrained models. We integrate a number of advanced techniques into our systems, including data selection, back/forward translation, larger batch learning, model ensemble, finetuning, and system combination. Specifically, we propose a hybrid data selection method to select high-quality and in-domain sentences from out-of-domain data. To better capture the source contexts, we augment NAT models with evolved cross-attention. Furthermore, we explore transferring general knowledge from four different pre-training language models to the downstream translation task. In general, we present extensive experimental results for this new translation task. Among all the participants, our German-to-English primary system is ranked second in terms of BLEU scores.