TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models
Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei Zhang, Longtao Huang, Hui Xue
Abstract
Knowledge-enhanced pre-trained language models (KEPLMs) utilize external knowledge to improve language understanding. Previous models facilitated knowledge acquisition by incorporating knowledge-related pre-training tasks learned from relation triples in knowledge graphs. However, these models do not prioritize learning embeddings for entity-related tokens, and updating all parameters of a KEPLM is computationally demanding. This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models. We observe that entities in text corpora follow a long-tail distribution, and some are suboptimally optimized and hinder the pre-training process. To tackle this, we employ a robust approach to inject knowledge triples together with a knowledge-augmented memory bank that captures valuable information. Moreover, updating only the small subset of neurons in the feed-forward networks (FFNs) that store factual knowledge is both sufficient and efficient: we use dynamic knowledge routing to identify knowledge paths in FFNs and selectively update parameters during pre-training. Experimental results show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs on knowledge probing tasks and multiple knowledge-aware language understanding tasks.
- Anthology ID:
- 2024.lrec-main.1461
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 16790–16801
- URL:
- https://aclanthology.org/2024.lrec-main.1461
- Cite (ACL):
- Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei Zhang, Longtao Huang, and Hui Xue. 2024. TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 16790–16801, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models (Yan et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2024.lrec-main.1461.pdf
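The abstract's central efficiency idea, updating only the FFN neurons identified as lying on "knowledge paths" while freezing the rest, can be illustrated with a minimal sketch. The `ToyFFN` module, the `knowledge_neuron_mask` helper, the activation-magnitude selection heuristic, and the `top_ratio` value below are illustrative assumptions, not the authors' dynamic knowledge routing implementation.

```python
# Minimal sketch of selectively updating FFN neurons, assuming a PyTorch-style
# transformer FFN. The top-activation routing heuristic is an assumption for
# illustration only; TRELM's actual routing criterion may differ.
import torch
import torch.nn as nn

class ToyFFN(nn.Module):
    def __init__(self, d_model=64, d_ff=256):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)   # "keys": project hidden state onto FFN neurons
        self.w_out = nn.Linear(d_ff, d_model)  # "values": project neuron activations back

    def forward(self, x):
        self.last_act = torch.relu(self.w_in(x))  # cache activations for routing
        return self.w_out(self.last_act)

def knowledge_neuron_mask(ffn, top_ratio=0.1):
    """Mark the most strongly activated FFN neurons as the knowledge path (1 = update, 0 = freeze)."""
    scores = ffn.last_act.abs().mean(dim=(0, 1))   # (d_ff,) mean activation per neuron
    k = max(1, int(top_ratio * scores.numel()))
    mask = torch.zeros_like(scores)
    mask[scores.topk(k).indices] = 1.0
    return mask

ffn = ToyFFN()
x = torch.randn(2, 8, 64)            # (batch, seq_len, d_model)
loss = ffn(x).pow(2).mean()          # stand-in for a pre-training loss
loss.backward()

mask = knowledge_neuron_mask(ffn)
with torch.no_grad():
    ffn.w_in.weight.grad *= mask.unsqueeze(1)   # rows of w_in correspond to FFN neurons
    ffn.w_in.bias.grad *= mask
    ffn.w_out.weight.grad *= mask.unsqueeze(0)  # columns of w_out correspond to FFN neurons
# A subsequent optimizer.step() would now modify only the selected neurons,
# leaving the remaining FFN parameters untouched.
```

In this sketch, gradients of unselected neurons are zeroed before the optimizer step, so only a small fraction of FFN parameters changes per update, which is the source of the efficiency gain the abstract describes.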