Pre-Training Transformers as Energy-Based Cloze Models
Kevin Clark, Minh-Thang Luong, Quoc Le, Christopher D. Manning
Abstract
We introduce Electric, an energy-based cloze model for representation learning over text. Like BERT, it is a conditional generative model of tokens given their contexts. However, Electric does not use masking or output a full distribution over tokens that could occur in a context. Instead, it assigns a scalar energy score to each input token indicating how likely it is given its context. We train Electric using an algorithm based on noise-contrastive estimation and elucidate how this learning objective is closely related to the recently proposed ELECTRA pre-training method. Electric performs well when transferred to downstream tasks and is particularly effective at producing likelihood scores for text: it re-ranks speech recognition n-best lists better than language models and much faster than masked language models. Furthermore, it offers a clearer and more principled view of what ELECTRA learns during pre-training.
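The abstract describes two core ideas: replacing the full softmax over the vocabulary with a single scalar energy per input token, and training that energy function with a noise-contrastive estimation (NCE) objective that distinguishes real tokens from sampled replacements. The sketch below illustrates both ideas in PyTorch. It is a minimal illustration only: the names `EnergyHead`, `nce_logits`, and `nce_loss`, the choice of `k` noise samples, and the simplified handling of the noise distribution `q` are assumptions made here, not the paper's implementation (see google-research/electra for the authors' code).

```python
# Illustrative sketch of an energy-based cloze objective trained with NCE.
# NOT the paper's implementation; module/function names and the treatment of
# the noise distribution q are assumptions for exposition.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class EnergyHead(nn.Module):
    """Maps per-token transformer hidden states to one scalar energy per token."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden) -> energies: (batch, seq_len)
        return self.proj(hidden_states).squeeze(-1)


def nce_logits(energies: torch.Tensor, log_q: torch.Tensor, k: int) -> torch.Tensor:
    # Under NCE, a token is "real" with probability
    #   exp(-E) / (exp(-E) + k * q(token | context)),
    # which is the sigmoid of the logit returned here.
    return -energies - (math.log(k) + log_q)


def nce_loss(energies_real, log_q_real, energies_noise, log_q_noise, k: int):
    """Binary NCE: push real tokens toward low energy, noise tokens toward high."""
    real_logits = nce_logits(energies_real, log_q_real, k)
    noise_logits = nce_logits(energies_noise, log_q_noise, k)
    loss_real = F.binary_cross_entropy_with_logits(
        real_logits, torch.ones_like(real_logits))
    loss_noise = F.binary_cross_entropy_with_logits(
        noise_logits, torch.zeros_like(noise_logits))
    return loss_real + k * loss_noise


# Toy usage: score a batch of contextual hidden states (e.g. from a transformer).
hidden = torch.randn(2, 8, 256)         # (batch, seq_len, hidden)
energies = EnergyHead(256)(hidden)      # (batch, seq_len): one energy per token
```

Because every position receives an energy in a single forward pass, a sentence-level plausibility score (roughly, the sum of negated energies) can be computed without masking one token at a time, which is what makes this kind of model much faster than a masked language model for re-ranking speech recognition n-best lists.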
- Anthology ID: 2020.emnlp-main.20
- Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month: November
- Year: 2020
- Address: Online
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 285–294
- URL: https://aclanthology.org/2020.emnlp-main.20
- DOI: 10.18653/v1/2020.emnlp-main.20
- Cite (ACL): Kevin Clark, Minh-Thang Luong, Quoc Le, and Christopher D. Manning. 2020. Pre-Training Transformers as Energy-Based Cloze Models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 285–294, Online. Association for Computational Linguistics.
- Cite (Informal): Pre-Training Transformers as Energy-Based Cloze Models (Clark et al., EMNLP 2020)
- PDF: https://aclanthology.org/2020.emnlp-main.20.pdf
- Code: google-research/electra
- Data: GLUE, LibriSpeech, OpenWebText, WebText