G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

Zhongwei Wan, Yichun Yin, Wei Zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu


Abstract
General pre-trained language models (PLMs), such as BERT, have achieved remarkable performance on various NLP tasks. Recently, domain-specific PLMs have been proposed to boost task performance in specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs on domain-specific corpora. However, this domain-adaptive pre-training (DAPT) tends to forget the general knowledge acquired by the general PLM, leading to catastrophic forgetting and sub-optimal performance. To alleviate this problem, we propose a new framework, the Memory-Augmented Pre-trained Language Model (MAP), which augments the domain-specific PLM with a memory built from the frozen general PLM, so that the general knowledge is not lost. Specifically, we propose a new memory-augmented layer and, based on it, explore different augmentation strategies for building the memory and fusing it into the domain-specific PLM. We demonstrate the effectiveness of MAP on tasks from different domains (biomedical and computer science publications, news, and reviews) and of different kinds (text classification, QA, and NER), and the extensive results show that the proposed MAP achieves SOTA results on these tasks.
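The abstract describes a memory-augmented layer that fuses representations from a frozen general PLM (the memory) into a domain-specific PLM. Below is a minimal, hypothetical PyTorch sketch of one such layer; the cross-attention-plus-gating fusion and all class and parameter names are illustrative assumptions, not the paper's exact design.

```python
# Hypothetical sketch of a memory-augmented layer: domain-PLM hidden states
# cross-attend over hidden states from a frozen general PLM ("memory"), and
# the attended memory is blended back in via a learned gate. The specific
# fusion strategy here is an assumption for illustration.
import torch
import torch.nn as nn


class MemoryAugmentedLayer(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int = 12):
        super().__init__()
        # Cross-attention: queries from the domain PLM, keys/values from memory.
        self.cross_attn = nn.MultiheadAttention(
            hidden_size, num_heads, batch_first=True
        )
        # Gate controlling how much memory information is mixed in.
        self.gate = nn.Linear(2 * hidden_size, hidden_size)
        self.layer_norm = nn.LayerNorm(hidden_size)

    def forward(
        self,
        domain_hidden: torch.Tensor,   # (batch, seq_len, hidden) from the domain PLM
        memory_hidden: torch.Tensor,   # (batch, mem_len, hidden) from the frozen general PLM
    ) -> torch.Tensor:
        attended, _ = self.cross_attn(
            query=domain_hidden, key=memory_hidden, value=memory_hidden
        )
        # Sigmoid gate decides, per position and dimension, how much of the
        # general-PLM memory to blend into the domain representation.
        g = torch.sigmoid(self.gate(torch.cat([domain_hidden, attended], dim=-1)))
        fused = g * attended + (1.0 - g) * domain_hidden
        return self.layer_norm(fused)


if __name__ == "__main__":
    layer = MemoryAugmentedLayer(hidden_size=768)
    domain_h = torch.randn(2, 128, 768)   # domain-specific PLM hidden states
    memory_h = torch.randn(2, 128, 768)   # frozen general PLM hidden states (memory)
    out = layer(domain_h, memory_h)
    print(out.shape)  # torch.Size([2, 128, 768])
```

In this sketch the memory states would come from a forward pass of the frozen general PLM, while only the domain-specific PLM and the fusion parameters are trained.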
Anthology ID:
2022.emnlp-main.441
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6585–6597
URL:
https://aclanthology.org/2022.emnlp-main.441
DOI:
10.18653/v1/2022.emnlp-main.441
Cite (ACL):
Zhongwei Wan, Yichun Yin, Wei Zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, and Qun Liu. 2022. G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 6585–6597, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks (Wan et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/ingest-acl-2023-videos/2022.emnlp-main.441.pdf