Jianhe Lin
2024
UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle
Yutao Mou | Kexiang Wang | Jianhe Lin | Dehong Ma | Jun Fan | Daiting Shi | Zhicong Cheng | Gu Simiu | Dawei Yin | Weiran Xu
Findings of the Association for Computational Linguistics: NAACL 2024
Pre-training followed by fine-tuning has become the standard training paradigm for NLP tasks and is also widely used in industrial-level applications. However, this paradigm still has a limitation: simply fine-tuning with task-specific objectives tends to converge to local minima, resulting in sub-optimal performance. In this paper, we first propose a new paradigm, knowledge rekindle, which aims to re-incorporate the fine-tuned expert model into the training cycle and break through the performance upper bounds of experts without introducing additional annotated data. We then propose a unified expert-guided pre-training (UEGP) framework for knowledge rekindle. Specifically, we reuse fine-tuned expert models for various downstream tasks as knowledge sources and inject task-specific prior knowledge into pre-trained language models (PLMs) by means of knowledge distillation. In this process, we perform multi-task learning with knowledge distillation and masked language modeling (MLM) objectives. We also explore whether mixture-of-expert guided pre-training (MoEGP) can further enhance the effect of knowledge rekindle. Experiments and analysis on eight datasets in the GLUE benchmark and an industrial-level search re-ranking dataset show the effectiveness of our method.
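The multi-task objective mentioned in the abstract, knowledge distillation from a fine-tuned expert combined with masked language modeling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the temperature, the `kd_weight` mixing coefficient, and the use of KL divergence as the distillation term are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, expert_logits, temperature=2.0):
    """KL(expert || student) on temperature-softened distributions.

    The temperature value is an assumption, not taken from the paper.
    """
    p = softmax(expert_logits, temperature)   # expert (teacher) distribution
    q = softmax(student_logits, temperature)  # PLM (student) distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mlm_loss(token_logits, target_id):
    """Cross-entropy for a single masked position."""
    probs = softmax(token_logits)
    return -math.log(probs[target_id])

def combined_loss(student_logits, expert_logits, token_logits, target_id,
                  kd_weight=0.5):
    """Weighted sum of the two objectives; kd_weight is hypothetical."""
    kd = distillation_loss(student_logits, expert_logits)
    mlm = mlm_loss(token_logits, target_id)
    return kd_weight * kd + (1.0 - kd_weight) * mlm
```

When the student already matches the expert, the distillation term vanishes and only the MLM term contributes, which is one way to see how the two objectives share a single training signal in this sketch.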