CoELM: Construction-Enhanced Language Modeling
Lvxiaowei Xu, Zhilin Gong, Jianhua Dai, Tianxiang Wang, Ming Cai, Jiawei Peng
Abstract
Recent studies have shown that integrating constructional information can improve the performance of pre-trained language models (PLMs) in natural language understanding. However, exploration into leveraging constructional information to enhance generative language models for natural language generation has been limited. Additionally, probing studies indicate that PLMs primarily grasp the syntactic structure of constructions but struggle to capture their semantics. In this work, we encode constructions as inductive biases to explicitly embed constructional semantics and guide the generation process. We begin by presenting a construction grammar induction framework designed to automatically identify constructions from corpora. Subsequently, we propose the Construction-Enhanced Language Model (CoELM). It introduces a construction-guided language modeling approach that employs a dynamic sequence reassembly strategy during pre-training. Extensive experiments have demonstrated the superiority of CoELM across various benchmarks.- Anthology ID:
- 2024.acl-long.542
- Volume:
- Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 10061–10081
- Language:
- URL:
- https://aclanthology.org/2024.acl-long.542
- DOI:
- Cite (ACL):
- Lvxiaowei Xu, Zhilin Gong, Jianhua Dai, Tianxiang Wang, Ming Cai, and Jiawei Peng. 2024. CoELM: Construction-Enhanced Language Modeling. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10061–10081, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- CoELM: Construction-Enhanced Language Modeling (Xu et al., ACL 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.acl-long.542.pdf