Pengzhang Liu
2025
Beyond Logits: Aligning Feature Dynamics for Effective Knowledge Distillation
Guoqiang Gong | Jiaxing Wang | Jin Xu | Deping Xiang | Zicheng Zhang | Leqi Shen | Yifeng Zhang | Junhua Shu | Zhaolong Xing | Zhen Chen | Pengzhang Liu | Ke Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Knowledge distillation (KD) compresses large language models (LLMs), known as teacher models, into lightweight versions called student models, enabling efficient inference and downstream applications. However, prevailing approaches predominantly focus on matching the final output distributions of the student and teacher models. Drawing on the perspective that transformers can be viewed as discretizing ordinary differential equations (ODEs) over integer time steps (corresponding to layer indices), where intermediate features evolve across layers, we argue that effective KD requires aligning the entire feature dynamics between teacher and student models, which we call feature dynamics distillation (FDD). This alignment involves matching both the feature trajectory and its first-order derivative, rather than only the final states. Our approach extends the original KD objective with two additional loss terms: layer-wise feature KD, which matches the discretized feature trajectory, and layer feature delta KD, which matches first-order changes in features across adjacent layers. Extensive experiments on various tasks validate the effectiveness of our distillation method.
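The abstract describes a three-term objective but no implementation. Below is a minimal PyTorch sketch of one plausible way to assemble such a loss, assuming a one-to-one alignment between selected student and teacher layers, MSE as the feature distance, a learned projection `proj` to bridge hidden-size mismatches, and weights `alpha` and `beta`; these are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fdd_loss(student_feats, teacher_feats, student_logits, teacher_logits,
             proj, alpha=1.0, beta=1.0, temperature=2.0):
    """Illustrative three-term loss: logit KD + layer-wise feature KD
    + layer feature delta KD. Feature lists are assumed pre-aligned."""
    # 1) Standard logit KD: KL divergence between softened distributions.
    logit_kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Project student features into the teacher's hidden size once.
    s_feats = [proj(s) for s in student_feats]

    # 2) Layer-wise feature KD: match the discretized feature trajectory.
    feat_kd = sum(F.mse_loss(s, t) for s, t in zip(s_feats, teacher_feats))
    feat_kd = feat_kd / len(s_feats)

    # 3) Layer feature delta KD: match first-order feature changes between
    #    adjacent layers, a finite-difference estimate of the derivative.
    s_delta = [b - a for a, b in zip(s_feats[:-1], s_feats[1:])]
    t_delta = [b - a for a, b in zip(teacher_feats[:-1], teacher_feats[1:])]
    delta_kd = sum(F.mse_loss(ds, dt) for ds, dt in zip(s_delta, t_delta))
    delta_kd = delta_kd / len(s_delta)

    return logit_kd + alpha * feat_kd + beta * delta_kd

# Toy shapes: 4 aligned layers, batch 2, seq 8, student dim 64, teacher dim 128.
proj = nn.Linear(64, 128)
s_feats = [torch.randn(2, 8, 64) for _ in range(4)]
t_feats = [torch.randn(2, 8, 128) for _ in range(4)]
loss = fdd_loss(s_feats, t_feats, torch.randn(2, 8, 1000),
                torch.randn(2, 8, 1000), proj)
```

Since layer indices advance by one, the adjacent-layer differences act as a unit-step finite-difference approximation of the ODE's derivative along the feature trajectory.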
2022
LEGO-ABSA: A Prompt-based Task Assemblable Unified Generative Framework for Multi-task Aspect-based Sentiment Analysis
Tianhao Gao | Jun Fang | Hanyu Liu | Zhiyuan Liu | Chao Liu | Pengzhang Liu | Yongjun Bao | Weipeng Yan
Proceedings of the 29th International Conference on Computational Linguistics
Aspect-based sentiment analysis (ABSA) has received increasing attention recently. ABSA can be divided into multiple tasks according to the elements to be extracted. Existing generative methods usually treat the output as a whole string rather than a combination of different elements, and focus on only a single task at a time. This paper proposes a unified generative multi-task framework that solves multiple ABSA tasks by controlling the type of task prompt, which is composed of multiple element prompts. Further, the proposed approach can be trained on simple tasks and transferred to difficult tasks by assembling task prompts, like assembling Lego bricks. We conduct experiments on six ABSA tasks across multiple benchmarks. Our multi-task approach achieves new state-of-the-art results on almost all tasks and competitive results in task-transfer scenarios.
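To make the assembly idea concrete, here is a minimal, purely illustrative Python sketch; the element-prompt templates, placeholder tokens, and joining scheme below are hypothetical stand-ins, not the templates used in the paper.

```python
# Hypothetical element prompts; actual LEGO-ABSA templates may differ.
ELEMENT_PROMPTS = {
    "aspect": "the aspect term is [A]",
    "opinion": "the opinion term is [O]",
    "polarity": "the sentiment polarity is [P]",
}

def assemble_task_prompt(elements):
    """Compose a task prompt from reusable element prompts, so harder
    tasks can be built from elements seen on simpler training tasks."""
    return " , ".join(ELEMENT_PROMPTS[e] for e in elements)

# Simple task: aspect term extraction uses one element prompt.
ate_prompt = assemble_task_prompt(["aspect"])
print(ate_prompt)  # the aspect term is [A]

# Harder task: aspect sentiment triplet extraction reuses all three.
aste_prompt = assemble_task_prompt(["aspect", "opinion", "polarity"])
print(aste_prompt)
```

Because each task prompt is just a combination of shared element prompts, a model trained with some combinations can, in principle, be handed an unseen combination at inference time, which is the task-transfer setting the abstract describes.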