ELLE: Efficient Lifelong Pre-training for Emerging Data

Yujia Qin; Jiajie Zhang; Yankai Lin; Zhiyuan Liu; Peng Li; Maosong Sun; Jie Zhou

doi:10.18653/v1/2022.findings-acl.220

Abstract

Current pre-trained language models (PLMs) are typically trained with static data, ignoring that in real-world scenarios, streaming data from various sources may continuously grow. This requires PLMs to integrate the information from all the sources in a lifelong manner. Although this goal could be achieved by exhaustive pre-training on all the existing data, such a process is known to be computationally expensive. To this end, we propose ELLE, which aims at efficient lifelong pre-training for emerging data. Specifically, ELLE consists of (1) function preserved model expansion, which flexibly expands an existing PLM’s width and depth to improve the efficiency of knowledge acquisition; and (2) pre-trained domain prompts, which disentangle the versatile knowledge learned during pre-training and stimulate the proper knowledge for downstream tasks. We evaluate ELLE on streaming data from 5 domains, using BERT and GPT. The results show the superiority of ELLE over various lifelong learning baselines in both pre-training efficiency and downstream performance. The code is publicly available at https://github.com/thunlp/ELLE.
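
To make the idea of function preserved expansion concrete, here is a minimal sketch of widening the hidden layer of a toy two-layer ReLU network so that its outputs stay the same. It is a generic Net2Net-style illustration, not ELLE's implementation: the names `mlp` and `widen_hidden`, the list-of-rows weight layout, and the random choice of which units to copy are all assumptions made for this example. ELLE's actual procedure for Transformer width and depth is described in the paper and the released code.

```python
import random


def mlp(x, W1, b1, W2, b2):
    """Toy two-layer ReLU network on one input vector (hypothetical helper).

    W1 has shape [in_dim][hidden], W2 has shape [hidden][out_dim].
    """
    hidden = [
        max(0.0, sum(xi * W1[i][j] for i, xi in enumerate(x)) + b1[j])
        for j in range(len(b1))
    ]
    return [
        sum(h * W2[j][k] for j, h in enumerate(hidden)) + b2[k]
        for k in range(len(b2))
    ]


def widen_hidden(W1, b1, W2, new_width, rng):
    """Widen the hidden layer without changing the network's function.

    Assumed scheme: every added unit copies an existing unit's incoming
    weights and bias, and the outgoing weights of all copies of a unit are
    divided by the number of copies, so their contributions sum to the
    original one.
    """
    old_width = len(b1)
    if new_width < old_width:
        raise ValueError("new_width must be >= the current hidden width")
    # mapping[m] is the original unit that new unit m duplicates.
    mapping = list(range(old_width)) + [
        rng.randrange(old_width) for _ in range(new_width - old_width)
    ]
    counts = [mapping.count(j) for j in range(old_width)]
    W1_new = [[row[j] for j in mapping] for row in W1]
    b1_new = [b1[j] for j in mapping]
    W2_new = [[w / counts[j] for w in W2[j]] for j in mapping]
    return W1_new, b1_new, W2_new


if __name__ == "__main__":
    rng = random.Random(0)
    W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
    b1 = [rng.uniform(-1, 1) for _ in range(3)]
    W2 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
    b2 = [rng.uniform(-1, 1) for _ in range(2)]
    x = [rng.uniform(-1, 1) for _ in range(4)]

    W1w, b1w, W2w = widen_hidden(W1, b1, W2, new_width=7, rng=rng)
    # The two printed outputs should agree up to floating-point rounding.
    print(mlp(x, W1, b1, W2, b2))
    print(mlp(x, W1w, b1w, W2w, b2))
```

Because the widened network starts from the same function as the original, training on new data can continue from existing knowledge rather than from a random initialization, which is the efficiency argument the abstract makes for expansion.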

Anthology ID: 2022.findings-acl.220
Volume: Findings of the Association for Computational Linguistics: ACL 2022
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2789–2810
URL: https://aclanthology.org/2022.findings-acl.220
DOI: 10.18653/v1/2022.findings-acl.220
Cite (ACL): Yujia Qin, Jiajie Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, and Jie Zhou. 2022. ELLE: Efficient Lifelong Pre-training for Emerging Data. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2789–2810, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): ELLE: Efficient Lifelong Pre-training for Emerging Data (Qin et al., Findings 2022)
PDF: https://preview.aclanthology.org/naacl24-info/2022.findings-acl.220.pdf
Software: 2022.findings-acl.220.software.zip
Code: thunlp/elle
