UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle

Yutao Mou; Kexiang Wang; Jianhe Lin; Dehong Ma; Jun Fan; Daiting Shi; Zhicong Cheng; Gu Simiu; Dawei Yin; Weiran Xu

Abstract

The pre-training and fine-tuning framework has become the standard training paradigm for NLP tasks and is also widely used in industrial-level applications. However, this paradigm still has a limitation: simply fine-tuning with task-specific objectives tends to converge to local minima, resulting in sub-optimal performance. In this paper, we first propose a new paradigm, knowledge rekindle, which aims to re-incorporate the fine-tuned expert model into the training cycle and break through the performance upper bounds of experts without introducing additional annotated data. We then propose a unified expert-guided pre-training (UEGP) framework for knowledge rekindle. Specifically, we reuse fine-tuned expert models for various downstream tasks as knowledge sources and inject task-specific prior knowledge into pre-trained language models (PLMs) by means of knowledge distillation. In this process, we perform multi-task learning with knowledge distillation and masked language modeling (MLM) objectives. We also explore whether mixture-of-expert guided pre-training (MoEGP) can further enhance the effect of knowledge rekindle. Experiments and analysis on eight datasets in the GLUE benchmark and an industrial-level search re-ranking dataset show the effectiveness of our method.
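
As a rough illustration of the multi-task objective described in the abstract, the sketch below combines a knowledge distillation term with an MLM term. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the KL-divergence form of the distillation loss, the temperature, the weighting factor `alpha`, and all function names are hypothetical and may differ from what the paper actually uses.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, expert_logits, temperature=2.0):
    """KL(expert || student) on temperature-softened distributions.

    Assumption: a standard soft-label distillation loss; the paper
    may define its distillation term differently.
    """
    p = softmax(expert_logits, temperature)   # expert (teacher) distribution
    q = softmax(student_logits, temperature)  # PLM (student) distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mlm_loss(token_logits, target_id):
    """Cross-entropy for a single masked position."""
    probs = softmax(token_logits)
    return -math.log(probs[target_id])

def multitask_loss(student_logits, expert_logits, token_logits, target_id,
                   alpha=0.5, temperature=2.0):
    """Weighted sum of the two objectives; alpha is a hypothetical weight."""
    distill = kd_loss(student_logits, expert_logits, temperature)
    mlm = mlm_loss(token_logits, target_id)
    return alpha * distill + (1.0 - alpha) * mlm
```

When the student's task logits already match the expert's, the distillation term is zero and only the MLM term contributes, which is one way to see how the two objectives trade off under a fixed weight.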

Anthology ID: 2024.findings-naacl.170
Volume: Findings of the Association for Computational Linguistics: NAACL 2024
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2661–2673
URL: https://aclanthology.org/2024.findings-naacl.170
Cite (ACL): Yutao Mou, Kexiang Wang, Jianhe Lin, Dehong Ma, Jun Fan, Daiting Shi, Zhicong Cheng, Gu Simiu, Dawei Yin, and Weiran Xu. 2024. UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2661–2673, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle (Mou et al., Findings 2024)
PDF: https://preview.aclanthology.org/naacl24-info/2024.findings-naacl.170.pdf