Chenming Shang
2026
Find Your Optimal Teacher: Personalized Data Synthesis via Router-Guided Multi-Teacher Distillation
Hengyuan Zhang | Shiping Yang | Xiao Liang | Chenming Shang | Yuxuan Jiang | Chaofan Tao | Jing Xiong | Hayden Kwok-Hay So | Ruobing Xie | Angel X Chang | Ngai Wong
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Training student models on synthetic data generated by strong teacher models is a promising approach to distilling the capabilities of teachers. However, existing studies reveal that stronger models are not always optimal teachers, suggesting a mismatch between the teacher’s output and the student’s learning ability. To address this issue, we propose PerSyn (Personalized data Synthesis), a novel and efficient approach that customizes synthetic data to align with the learning capabilities of the student model. Specifically, PerSyn routes each prompt to its optimal teacher via a query-level router that jointly considers the student models’ learnability and the teacher models’ response quality. It shifts the synthesis paradigm from the conventional "Generate then Select" to the more efficient "Route then Generate", eliminating the need for all teacher models to generate parallel responses across the entire prompt set. Extensive experiments across different model families and scales demonstrate that PerSyn consistently outperforms all baselines on six benchmarks, covering instruction tuning and math reasoning settings. Further analysis verifies the effectiveness of PerSyn and offers additional insights for future research. Our code is available at https://anonymous.4open.science/r/PerSyn-8D85.
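The "Route then Generate" idea in the abstract above can be illustrated with a minimal sketch. Everything named here is hypothetical: the abstract does not say how the router combines learnability and response quality, so the weighted sum with `alpha`, the `toy_router` scores, and the toy teachers are assumptions for illustration only, not the PerSyn implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces (not from the paper):
# a teacher maps a prompt to a response string, and a router maps
# (prompt, teacher_name) to estimated (learnability, quality) scores.
Teacher = Callable[[str], str]
Router = Callable[[str, str], Tuple[float, float]]


def route_then_generate(
    prompts: List[str],
    teachers: Dict[str, Teacher],
    router: Router,
    alpha: float = 0.5,
) -> List[Tuple[str, str, str]]:
    """Pick one teacher per prompt, then call only that teacher.

    The weighted sum below is an assumed way to combine the two
    signals; the abstract only says they are considered jointly.
    """
    data = []
    for prompt in prompts:
        def score(name: str) -> float:
            learnability, quality = router(prompt, name)
            return alpha * learnability + (1 - alpha) * quality

        best = max(teachers, key=score)
        data.append((prompt, best, teachers[best](prompt)))
    return data


# Toy setup that counts teacher calls to show the cost difference.
calls = {"small": 0, "large": 0}


def make_teacher(name: str) -> Teacher:
    def generate(prompt: str) -> str:
        calls[name] += 1
        return f"{name} answer to: {prompt}"
    return generate


def toy_router(prompt: str, name: str) -> Tuple[float, float]:
    # Made-up scores: the large teacher is always higher quality, but
    # its output is assumed harder to learn from on short prompts.
    if name == "small":
        return (0.9, 0.5)
    return (0.3, 0.9) if len(prompt) < 20 else (0.8, 0.9)


teachers = {"small": make_teacher("small"), "large": make_teacher("large")}
prompts = ["2+2?", "Prove that the sum of two even numbers is even."]
dataset = route_then_generate(prompts, teachers, toy_router)
```

With two prompts and two teachers, this sketch makes two teacher calls in total, whereas a "Generate then Select" pipeline would make four (every teacher answers every prompt) before selecting.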
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
Hengyuan Zhang | Zhihao Zhang | Ercong Nie | Mingyang Wang | Zunhai Su | Yiwei Wang | Qianli Wang | Shuzhou Yuan | Xufeng Duan | Qibo Xue | Zeping Yu | Chenming Shang | Xiao Liang | Jing Xiong | Hui Shen | Chaofan Tao | Zhengwu Liu | Senjie Jin | Zhiheng Xi | Dongdong Zhang | Sophia Ananiadou | Tao Gui | Ruobing Xie | Hayden Kwok-Hay So | Hinrich Schuetze | Xuanjing Huang | Qi Zhang | Ngai Wong
Findings of the Association for Computational Linguistics: ACL 2026
Mechanistic Interpretability (MI) has emerged as a vital approach to demystify the opaque decision-making of Large Language Models (LLMs). However, existing reviews primarily treat MI as an observational science, summarizing analytical insights while lacking a systematic framework for actionable intervention. To bridge this gap, we present a practical survey structured around the pipeline: "Locate, Steer, and Improve." We formally categorize Localizing (diagnosis) and Steering (intervention) methods based on specific Interpretable Objects to establish a rigorous intervention protocol. Furthermore, we demonstrate how this framework enables tangible improvements in Alignment, Capability, and Efficiency, effectively operationalizing MI as a practical engineering toolkit for model optimization. The curated paper list of this work is available at https://anonymous.4open.science/r/Act-MI-F068.
2025
ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Multilingual Contrastive Framework
Hengyuan Zhang | Chenming Shang | Sizhe Wang | Dongdong Zhang | Yiyao Yu | Feng Yao | Renliang Sun | Yujiu Yang | Furu Wei
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Although fine-tuning Large Language Models (LLMs) with multilingual data can rapidly enhance their multilingual capabilities, these models still exhibit a performance gap between the dominant language (e.g., English) and non-dominant ones due to the imbalance of training data across languages. To further enhance the performance of non-dominant languages, we propose ShifCon, a Shift-based multilingual Contrastive framework that aligns the internal forward process of other languages toward that of the dominant one. Specifically, it shifts the representations of non-dominant languages into the dominant language subspace, allowing them to access the relatively rich information encoded in the model parameters. The enriched representations are then shifted back into their original language subspace before generation. Moreover, we introduce a subspace distance metric to pinpoint the optimal layer area for shifting representations and employ multilingual contrastive learning to further enhance the alignment of representations within this area. Experiments demonstrate that our ShifCon framework significantly enhances the performance of non-dominant languages, particularly low-resource ones. Further analysis verifies the effectiveness of ShifCon and offers additional insights for future research.
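The shift-forward and shift-back steps described above can be pictured with a toy example. The abstract does not specify how the shift is computed, so this sketch assumes a simple translation by the difference of per-language mean representations; the function names and the two-dimensional vectors are hypothetical and are not taken from ShifCon.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def shift(h: Vector, source_mean: Vector, target_mean: Vector) -> Vector:
    """Translate h from the source-language region toward the target one.

    Assumption: the shift is a mean-difference translation.
    """
    return [x - s + t for x, s, t in zip(h, source_mean, target_mean)]


# Toy hidden states for two languages (made-up numbers).
dominant_states = [[1.0, 2.0], [3.0, 4.0]]
non_dominant_states = [[-1.0, 0.0], [1.0, 2.0]]

mu_dominant = mean_vector(dominant_states)
mu_non_dominant = mean_vector(non_dominant_states)

h = [0.5, 1.5]  # a non-dominant-language representation
h_shifted = shift(h, mu_non_dominant, mu_dominant)

# In a real model, intermediate layers would transform h_shifted here
# before it is shifted back; this sketch skips that step, so the
# round trip simply recovers the original vector.
h_restored = shift(h_shifted, mu_dominant, mu_non_dominant)
```

In this toy case the shift moves `h` by the gap between the two language means and the reverse shift undoes it exactly; any benefit in practice would come from the layers applied between the two shifts.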
2023
Assisting Language Learners: Automated Trans-Lingual Definition Generation via Contrastive Prompt Learning
Hengyuan Zhang | Dawei Li | Yanran Li | Chenming Shang | Chufan Shi | Yong Jiang
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
The standard definition generation task requires automatically producing mono-lingual definitions (e.g., English definitions for English words), but ignores that the generated definitions may also contain words unfamiliar to language learners. In this work, we propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language, i.e., the native speaker’s language. Initially, we explore the unsupervised setting of this task and build a simple implementation by fine-tuning a multi-lingual machine translation model. Then, we develop two novel methods, Prompt Combination and Contrastive Prompt Learning, to further enhance generation quality. Our methods are evaluated against the baseline Pipeline method in both rich- and low-resource settings, and we empirically establish their superiority in generating higher-quality trans-lingual definitions.
Co-authors
- Hengyuan Zhang 4
- Xiao Liang (梁霄) 2
- Hayden Kwok-Hay So 2
- Chaofan Tao 2
- Ngai Wong 2
- Ruobing Xie 2
- Jing Xiong 2
- Dongdong Zhang 2
- Sophia Ananiadou 1
- Angel X Chang 1
- Xufeng Duan 1
- Tao Gui 1
- Xuan-Jing Huang (黄萱菁) 1
- Yong Jiang 1
- Yuxuan Jiang 1
- Senjie Jin 1
- Dawei Li 1
- Yanran Li 1
- Zhengwu Liu 1
- Ercong Nie 1
- Hinrich Schuetze 1
- Hui Shen 1
- Chufan Shi 1
- Zunhai Su 1
- Renliang Sun 1
- Sizhe Wang 1
- Mingyang Wang 1
- Yiwei Wang 1
- Qianli Wang 1
- Furu Wei 1
- Zhiheng Xi 1
- Qibo Xue 1
- Yujiu Yang 1
- Shiping Yang 1
- Feng Yao 1
- Yiyao Yu 1
- Zeping Yu 1
- Shuzhou Yuan 1
- Zhihao Zhang 1
- Qi Zhang 1