Ran Le
2026
PACE: Prefix-Protected and Difficulty-Aware Compression for Efficient Reasoning
Ruixiang Feng | Yuntao Wen | Silin Zhou | Ke Shi | Yifan Wang | Ran Le | Zhenwei An | Zongchao Chen | Chen Yang | Guangyue Peng | Yiming Jia | Dongsheng Wang | Tao Zhang | Lisi Chen | Yang Song | Shen Gao | Shuo Shang
Findings of the Association for Computational Linguistics: ACL 2026
Language Reasoning Models (LRMs) achieve strong performance by scaling test-time computation but often suffer from "overthinking", producing excessively long reasoning traces that increase latency and memory usage. Existing LRMs typically enforce conciseness with uniform length penalties, which over-compress crucial early deduction steps at the sequence level and indiscriminately penalize all queries at the group level. To address these limitations, we propose PACE, a dual-level framework for prefix-protected and difficulty-aware compression under hierarchical supervision. At the sequence level, prefix-protected optimization employs decaying mixed rollouts to maintain valid reasoning paths while promoting conciseness. At the group level, a difficulty-aware penalty dynamically scales length constraints based on query complexity, maintaining exploration for harder questions while curbing redundancy on easier ones. Extensive experiments on DeepSeek-R1-Distill-Qwen (1.5B/7B) demonstrate that PACE achieves a substantial reduction in token usage (up to 55.7%) while simultaneously improving accuracy (up to 4.1%) on math benchmarks, and that it generalizes to code, science, and general domains.
2025
Lock on Target! Precision Unlearning via Directional Control
Yuntao Wen | Ruixiang Feng | Feng Guo | Yifan Wang | Ran Le | Yang Song | Shen Gao | Shuo Shang
Findings of the Association for Computational Linguistics: EMNLP 2025
Unlearning methods aim to remove harmful, sensitive, or outdated knowledge effectively without costly retraining of the model. However, existing methods suffer from two critical limitations: (1) collateral forgetting, where erasing target data inadvertently removes related but desirable knowledge, and (2) generality forgetting, where aggressive unlearning degrades the model’s general capabilities. To address these challenges, we propose DirectiOn Guide unlEarning (DOGE), a novel method that enables precise knowledge erasure by identifying and leveraging a targeted “unlearning direction” in the model’s parameter space. DOGE first extracts this direction through differential analysis of representations for forgotten and retained samples, pinpointing the exact subspace associated with unwanted knowledge. It then selectively applies updates along this direction, ensuring minimal interference with retained information and general model performance. Experiments across multiple benchmarks demonstrate that DOGE achieves state-of-the-art unlearning precision while preserving both related knowledge and general capabilities.
2020
Translation vs. Dialogue: A Comparative Analysis of Sequence-to-Sequence Modeling
Wenpeng Hu | Ran Le | Bing Liu | Jinwen Ma | Dongyan Zhao | Rui Yan
Proceedings of the 28th International Conference on Computational Linguistics
Understanding neural models is a major topic of interest in the deep learning community. In this paper, we propose to interpret a general neural model comparatively. Specifically, we study the sequence-to-sequence (Seq2Seq) model in the contexts of two mainstream NLP tasks–machine translation and dialogue response generation–as they both use the Seq2Seq model. We investigate how the two tasks differ and how their differences result in major differences in the behavior of the resulting translation and dialogue generation systems. This study allows us to make several interesting observations and gain valuable insights, which can be used to help develop better translation and dialogue generation models. To our knowledge, no such comparative study has been done so far.
2019
Who Is Speaking to Whom? Learning to Identify Utterance Addressee in Multi-Party Conversations
Ran Le | Wenpeng Hu | Mingyue Shang | Zhenjun You | Lidong Bing | Dongyan Zhao | Rui Yan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Previous research on dialogue systems generally focuses on the conversation between two participants, yet multi-party conversations which involve more than two participants within one session bring up a more complicated but realistic scenario. In real multi-party conversations, we can observe who is speaking, but the addressee information is not always explicit. In this paper, we aim to tackle the challenge of identifying all the missing addressees in a conversation session. To this end, we introduce a novel who-to-whom (W2W) model which models users and utterances in the session jointly in an interactive way. We conduct experiments on the benchmark Ubuntu Multi-Party Conversation Corpus and the experimental results demonstrate that our model outperforms baselines with consistent improvements.