TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models
Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei Zhang, Longtao Huang, Hui Xue
Abstract
Knowledge-enhanced pre-trained language models (KEPLMs) utilize external knowledge to improve language understanding. Previous models facilitated knowledge acquisition by incorporating knowledge-related pre-training tasks learned from relation triples in knowledge graphs. However, these models do not prioritize learning embeddings for entity-related tokens, and updating all parameters of a KEPLM is computationally demanding. This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models. We observe that entities in text corpora follow a long-tail distribution, and some are suboptimally optimized and hinder the pre-training process. To tackle this, we employ a robust approach to inject knowledge triples together with a knowledge-augmented memory bank that captures valuable information. Moreover, updating only the small subset of neurons in the feed-forward networks (FFNs) that store factual knowledge is both sufficient and efficient: we use dynamic knowledge routing to identify knowledge paths in FFNs and selectively update parameters during pre-training. Experimental results show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs on knowledge probing tasks and multiple knowledge-aware language understanding tasks.
- Anthology ID:
- 2024.lrec-main.1461
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 16790–16801
- URL:
- https://aclanthology.org/2024.lrec-main.1461
- Cite (ACL):
- Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei Zhang, Longtao Huang, and Hui Xue. 2024. TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 16790–16801, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models (Yan et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2024.lrec-main.1461.pdf
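The abstract's central efficiency idea, updating only the FFN neurons identified as lying on "knowledge paths" while freezing the rest, can be illustrated with a minimal sketch. The `ToyFFN` module, the `knowledge_neuron_mask` helper, the activation-magnitude selection heuristic, and the `top_ratio` value below are illustrative assumptions, not the authors' dynamic knowledge routing implementation.

```python
# Minimal sketch of selectively updating FFN neurons, assuming a PyTorch-style
# transformer FFN. The top-activation routing heuristic is an assumption for
# illustration only; TRELM's actual routing criterion may differ.
import torch
import torch.nn as nn

class ToyFFN(nn.Module):
    def __init__(self, d_model=64, d_ff=256):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)   # "keys": project hidden state onto FFN neurons
        self.w_out = nn.Linear(d_ff, d_model)  # "values": project neuron activations back

    def forward(self, x):
        self.last_act = torch.relu(self.w_in(x))  # cache activations for routing
        return self.w_out(self.last_act)

def knowledge_neuron_mask(ffn, top_ratio=0.1):
    """Mark the most strongly activated FFN neurons as the knowledge path (1 = update, 0 = freeze)."""
    scores = ffn.last_act.abs().mean(dim=(0, 1))   # (d_ff,) mean activation per neuron
    k = max(1, int(top_ratio * scores.numel()))
    mask = torch.zeros_like(scores)
    mask[scores.topk(k).indices] = 1.0
    return mask

ffn = ToyFFN()
x = torch.randn(2, 8, 64)            # (batch, seq_len, d_model)
loss = ffn(x).pow(2).mean()          # stand-in for a pre-training loss
loss.backward()

mask = knowledge_neuron_mask(ffn)
with torch.no_grad():
    ffn.w_in.weight.grad *= mask.unsqueeze(1)   # rows of w_in correspond to FFN neurons
    ffn.w_in.bias.grad *= mask
    ffn.w_out.weight.grad *= mask.unsqueeze(0)  # columns of w_out correspond to FFN neurons
# A subsequent optimizer.step() would now modify only the selected neurons,
# leaving the remaining FFN parameters untouched.
```

In this sketch, gradients of unselected neurons are zeroed before the optimizer step, so only a small fraction of FFN parameters changes per update, which is the source of the efficiency gain the abstract describes.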