UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle
Yutao Mou, Kexiang Wang, Jianhe Lin, Dehong Ma, Jun Fan, Daiting Shi, Zhicong Cheng, Gu Simiu, Dawei Yin, Weiran Xu
Abstract
The pre-training and fine-tuning framework has become the standard training paradigm for NLP tasks and is also widely used in industrial-level applications. However, this paradigm still has a limitation: simply fine-tuning with task-specific objectives tends to converge to local minima, resulting in sub-optimal performance. In this paper, we first propose a new paradigm, knowledge rekindle, which aims to re-incorporate the fine-tuned expert model into the training cycle and break through the performance upper bounds of experts without introducing additional annotated data. We then propose a unified expert-guided pre-training (UEGP) framework for knowledge rekindle. Specifically, we reuse fine-tuned expert models for various downstream tasks as knowledge sources and inject task-specific prior knowledge into pre-trained language models (PLMs) by means of knowledge distillation. In this process, we perform multi-task learning with knowledge distillation and masked language modeling (MLM) objectives. We also explore whether mixture-of-expert guided pre-training (MoEGP) can further enhance the effect of knowledge rekindle. Experiments and analysis on eight datasets in the GLUE benchmark and an industrial-level search re-ranking dataset show the effectiveness of our method.
- Anthology ID:
- 2024.findings-naacl.170
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2024
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Kevin Duh, Helena Gomez, Steven Bethard
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2661–2673
- URL:
- https://aclanthology.org/2024.findings-naacl.170
- Cite (ACL):
- Yutao Mou, Kexiang Wang, Jianhe Lin, Dehong Ma, Jun Fan, Daiting Shi, Zhicong Cheng, Gu Simiu, Dawei Yin, and Weiran Xu. 2024. UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2661–2673, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle (Mou et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2024.findings-naacl.170.pdf
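
The abstract describes multi-task learning that combines a knowledge distillation objective with a masked language modeling (MLM) objective. The sketch below illustrates one common way such a combined loss can be written; it is not taken from the paper. The function names, the temperature, the mixing weight `alpha`, and the choice of KL divergence for distillation are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, expert_logits, temperature=2.0):
    """KL(expert || student) on temperature-softened distributions.

    Assumption: a standard distillation loss; the paper's exact form
    is not given in the abstract.
    """
    p = softmax(expert_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)


def mlm_loss(token_logits, target_id):
    """Cross-entropy for a single masked position."""
    probs = softmax(token_logits)
    return -math.log(probs[target_id])


def combined_loss(student_logits, expert_logits, token_logits, target_id,
                  alpha=0.5, temperature=2.0):
    """Weighted sum of the two objectives (alpha is a hypothetical weight)."""
    distill = kd_loss(student_logits, expert_logits, temperature)
    mlm = mlm_loss(token_logits, target_id)
    return alpha * distill + (1.0 - alpha) * mlm


if __name__ == "__main__":
    # Toy values: 3 task classes, 4-token vocabulary, masked token id 2.
    loss = combined_loss([1.0, 0.5, -0.5], [2.0, 0.0, -1.0],
                         [0.1, 0.2, 1.5, -0.3], target_id=2)
    print(f"combined loss: {loss:.4f}")
```

When the student's task logits already match the expert's, the distillation term vanishes and only the MLM term contributes, which is why a weighted sum is a natural way to balance the two signals in a sketch like this.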