Abstract
Pretrained contextualized text encoders are now a staple of the NLP community. We present a survey on language representation learning with the aim of consolidating a series of shared lessons learned across a variety of recent efforts. While significant advancements continue at a rapid pace, we find that enough has now been discovered, in different directions, that we can begin to organize advances according to common themes. Through this organization, we highlight important considerations when interpreting recent contributions and choosing which model to use.
- Anthology ID: 2020.emnlp-main.608
- Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month: November
- Year: 2020
- Address: Online
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 7516–7533
- URL: https://aclanthology.org/2020.emnlp-main.608
- DOI: 10.18653/v1/2020.emnlp-main.608
- Cite (ACL): Patrick Xia, Shijie Wu, and Benjamin Van Durme. 2020. Which *BERT? A Survey Organizing Contextualized Encoders. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7516–7533, Online. Association for Computational Linguistics.
- Cite (Informal): Which *BERT? A Survey Organizing Contextualized Encoders (Xia et al., EMNLP 2020)
- PDF: https://preview.aclanthology.org/nodalida-main-page/2020.emnlp-main.608.pdf
- Data: GLUE, SQuAD, SuperGLUE