Sunbowen Lee
2026
Finding RELIEF: Shaping Reasoning Behavior without Reasoning Supervision via Belief Engineering
Chak Tou Leong | Dingwei Chen | Heming Xia | Qingyu Yin | Sunbowen Lee | Jian Wang | Wenjie Li
Findings of the Association for Computational Linguistics: ACL 2026
Large reasoning models (LRMs) have achieved remarkable success through step-by-step chains of thought, yet they often suffer from excessive redundancy or unfaithful reasoning. Existing methods for shaping LRM behavior typically rely on reinforcement learning or fine-tuning with gold-standard reasoning traces, a paradigm that is both computationally expensive and difficult to scale. In this paper, we reveal that LRMs possess latent reasoning beliefs that internally track their own reasoning traits, which can be captured through simple logit probing without specialized training. Building on this insight, we propose Reasoning Belief Engineering (RELIEF), a simple yet effective framework that shapes LRM behavior by aligning the model’s self-concept with a target belief blueprint. Crucially, RELIEF completely bypasses the need for reasoning-trace supervision. It internalizes desired traits by fine-tuning on synthesized, self-reflective QA pairs that affirm the target belief. Extensive experiments on efficiency and faithfulness tasks demonstrate that RELIEF matches or outperforms behavior-supervised and preference-based baselines while requiring significantly lower training costs. Our analysis further validates that shifting a model’s reasoning belief effectively shapes its actual behavior.
A Multilingual Dataset and Empirical Validation for the Mutual Reinforcement Effect in Information Extraction
Chengguang Gan | Sunbowen Lee | Qingyu Yin | Yunhao Liang | Xinyang He | Hanjun Wei | Younghun Lim | Shijian Wang | Hexiang Huang | QingHao Zhang | Shiwen Ni | Tatsunori Mori
Findings of the Association for Computational Linguistics: ACL 2026
The Mutual Reinforcement Effect (MRE) describes a phenomenon in information extraction where word-level and sentence-level tasks can mutually improve each other when jointly modeled. While prior work has reported MRE in Japanese, its generality across languages and task settings has not been empirically validated, largely due to the lack of multilingual MRE datasets. To address this limitation, we introduce the Multilingual MRE Mix dataset (MMM), which consists of 21 sub-datasets covering English, Japanese, and Chinese. We propose an LLM-assisted dataset translation and alignment framework that significantly reduces manual annotation effort while preserving the structural requirements of MRE tasks. Building on MMM, we adopt a unified input-output framework to train an open-domain information extraction model and conduct extensive empirical studies, including full fine-tuning ablations and the construction of knowledgeable verbalizers based on MRE-mix data. Experimental results show that 76 percent of the MMM sub-datasets consistently exhibit the Mutual Reinforcement Effect across languages. These findings provide systematic empirical validation of MRE in multilingual settings and demonstrate its practical value for information extraction.
2025
Quantification of Large Language Model Distillation
Sunbowen Lee | Junting Zhou | Chang Ao | Kaige Li | Xeron Du | Sirui He | Haihong Wu | Tianci Liu | Jiaheng Liu | Hamid Alinejad-Rokny | Min Yang | Yitao Liang | Zhoufutu Wen | Shiwen Ni
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Model distillation is a fundamental technique in building large language models (LLMs), transferring knowledge from a teacher model to a student model. However, distillation can lead to model homogenization, reducing diversity among models and impairing their ability to robustly handle complex or novel tasks. These limitations underscore the need to systematically quantify the distillation process and its impact. In this work, we propose a framework to evaluate and quantify model distillation. Our method addresses two key aspects: (1) Identifying identity cognition contradictions to assess discrepancies in how models perceive and represent identity-related information, and (2) Analyzing multi-granularity response similarities across models to measure the extent of homogenization. Experimental results demonstrate two key insights: (1) Well-known closed-source and open-source LLMs usually exhibit high distillation degrees, except for Claude, Doubao, and Gemini. (2) Base LLMs show higher distillation degrees compared to aligned LLMs. By offering a systematic approach to improve the transparency of LLM data distillation, we call for LLMs with more independent development and more transparent technical reports to improve LLMs’ robustness and safety. The code and data are available at https://github.com/Aegis1863/LLMs-Distillation-Quantification.