Train No Evil: Selective Masking for Task-Guided Pre-Training

Yuxian Gu; Zhengyan Zhang; Xiaozhi Wang; Zhiyuan Liu; Maosong Sun

doi:10.18653/v1/2020.emnlp-main.566

Abstract

Pre-trained language models mostly follow the pre-train-then-fine-tune paradigm and have achieved strong performance on various downstream tasks. However, because the pre-training stage is typically task-agnostic and the fine-tuning stage usually suffers from insufficient supervised data, the models cannot always capture domain-specific and task-specific patterns well. In this paper, we propose a three-stage framework that adds a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. In this stage, the model is trained with masked language modeling on in-domain unsupervised data to learn domain-specific patterns, and a novel selective masking strategy is used to learn task-specific patterns. Specifically, we design a method to measure the importance of each token in a sequence and selectively mask the important tokens. Experimental results on two sentiment analysis tasks show that our method achieves comparable or even better performance with less than 50% of the computation cost, which indicates that our method is both effective and efficient. The source code for this paper is available at https://github.com/thunlp/SelectiveMasking.
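
The selective masking idea in the abstract (score each token, then mask the important ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how importance is measured, so the `scores` list, the `selective_mask` function, the `mask_ratio` parameter, and the `[MASK]` placeholder are all hypothetical names and values chosen for illustration.

```python
from typing import List, Sequence


def selective_mask(
    tokens: Sequence[str],
    scores: Sequence[float],
    mask_ratio: float = 0.15,  # assumed value, not taken from this page
    mask_token: str = "[MASK]",  # assumed placeholder string
) -> List[str]:
    """Replace the highest-scoring tokens with a mask placeholder.

    `scores` stands in for whatever importance measure is used; here it is
    simply supplied by the caller (hypothetical input).
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not tokens:
        return []

    # Mask at least one token, otherwise roughly mask_ratio of the sequence.
    n_mask = max(1, round(len(tokens) * mask_ratio))

    # Rank positions by descending importance; ties keep left-to-right order.
    ranked = sorted(range(len(tokens)), key=lambda i: -scores[i])
    to_mask = set(ranked[:n_mask])

    return [mask_token if i in to_mask else tok for i, tok in enumerate(tokens)]


# Toy sentiment example with made-up importance scores.
tokens = ["the", "movie", "was", "absolutely", "wonderful", "."]
scores = [0.01, 0.20, 0.02, 0.55, 0.90, 0.00]
masked = selective_mask(tokens, scores, mask_ratio=0.3)
print(masked)
```

In this toy input the two highest-scoring tokens are the sentiment-bearing words, so they are the ones replaced. Random masking, by contrast, would pick positions without regard to the scores.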

Anthology ID: 2020.emnlp-main.566
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 6966–6974
URL: https://aclanthology.org/2020.emnlp-main.566
DOI: 10.18653/v1/2020.emnlp-main.566
Cite (ACL): Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, and Maosong Sun. 2020. Train No Evil: Selective Masking for Task-Guided Pre-Training. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6966–6974, Online. Association for Computational Linguistics.
Cite (Informal): Train No Evil: Selective Masking for Task-Guided Pre-Training (Gu et al., EMNLP 2020)
PDF: https://preview.aclanthology.org/starsem-semeval-split/2020.emnlp-main.566.pdf
Video: https://slideslive.com/38938884
Code: thunlp/SelectiveMasking
Data: BookCorpus
