TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models

Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei Zhang, Longtao Huang, Hui Xue


Abstract
Knowledge-enhanced pre-trained language models (KEPLMs) use external knowledge to improve language understanding. Previous models facilitate knowledge acquisition by incorporating knowledge-related pre-training tasks learned from relation triples in knowledge graphs. However, these models do not prioritize learning embeddings for entity-related tokens, and updating all parameters of a KEPLM is computationally demanding. This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models. We observe that entities in text corpora follow a long-tail distribution, and that some of them are suboptimally optimized and hinder the pre-training process. To tackle this, we adopt a robust approach to injecting knowledge triples and use a knowledge-augmented memory bank to capture valuable information. Moreover, updating only a small subset of neurons in the feed-forward networks (FFNs) that store factual knowledge is both sufficient and efficient. Specifically, we use dynamic knowledge routing to identify knowledge paths in FFNs and selectively update parameters during pre-training. Experimental results show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs on knowledge probing tasks and multiple knowledge-aware language understanding tasks.
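
The idea of selectively updating a small subset of FFN neurons can be illustrated with a minimal sketch. The abstract does not specify how dynamic knowledge routing scores neurons, so the per-neuron scores below are a placeholder assumption, and the names select_neurons and masked_sgd_step are hypothetical, not taken from the paper. The sketch only shows the general pattern: pick the top-k neurons, apply a gradient step to their weight rows, and leave all other rows frozen.

```python
from typing import List

Matrix = List[List[float]]


def select_neurons(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring neurons, in ascending order.

    The scores stand in for whatever a routing step would produce;
    how TRELM computes them is not described in the abstract.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def masked_sgd_step(weights: Matrix, grads: Matrix,
                    selected: List[int], lr: float) -> Matrix:
    """Apply one SGD step to the selected rows only; copy other rows unchanged."""
    chosen = set(selected)
    updated = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in chosen:
            # Selected neuron: ordinary gradient descent update.
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
        else:
            # Frozen neuron: weights are carried over as-is.
            updated.append(list(w_row))
    return updated


# Toy FFN layer with four neurons (rows) and two inputs (columns).
weights = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
grads = [[1.0, 1.0] for _ in weights]
scores = [0.1, 0.9, 0.3, 0.8]  # placeholder routing scores

selected = select_neurons(scores, k=2)
new_weights = masked_sgd_step(weights, grads, selected, lr=0.5)
```

In this toy setting only two of the four rows change, so the cost of the update scales with the number of selected neurons rather than with the full layer. This is the intuition behind the efficiency claim, though the actual savings reported for TRELM depend on details that only the full paper provides.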
Anthology ID:
2024.lrec-main.1461
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
16790–16801
URL:
https://preview.aclanthology.org/icon-24-ingestion/2024.lrec-main.1461/
Cite (ACL):
Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei Zhang, Longtao Huang, and Hui Xue. 2024. TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 16790–16801, Torino, Italia. ELRA and ICCL.
Cite (Informal):
TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models (Yan et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/icon-24-ingestion/2024.lrec-main.1461.pdf