Quantifying the Contextualization of Word Representations with Semantic Class Probing
Mengjie Zhao, Philipp Dufter, Yadollah Yaghoobzadeh, Hinrich Schütze
Abstract
Pretrained language models achieve state-of-the-art results on many NLP tasks, but there are still many open questions about how and why they work so well. We investigate the contextualization of words in BERT. We quantify the amount of contextualization, i.e., how well words are interpreted in context, by studying the extent to which semantic classes of a word can be inferred from its contextualized embedding. Quantifying contextualization helps in understanding and utilizing pretrained language models. We show that the top layer representations support highly accurate inference of semantic classes; that the strongest contextualization effects occur in the lower layers; that local context is mostly sufficient for contextualizing words; and that top layer representations are more task-specific after finetuning while lower layer representations are more transferable. Finetuning uncovers task-related features, but pretrained knowledge about contextualization is still well preserved.
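The probing setup described in the abstract, inferring a word's semantic class from its contextualized embedding at a given BERT layer, can be sketched roughly as follows. This is a minimal illustration rather than the paper's implementation: the model name, toy sentences, semantic-class labels, and the scikit-learn logistic-regression probe are assumed choices.

```python
# Minimal sketch of semantic-class probing on BERT layer representations.
# Illustrative only; model name, sentences, labels, and the probe are assumptions.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def word_embedding(sentence, word, layer):
    """Hidden state of `word`'s subtoken at the given BERT layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**enc).hidden_states  # (embeddings, layer 1, ..., layer 12)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    idx = tokens.index(word)  # assumes the word is kept as a single subtoken
    return hidden_states[layer][0, idx].numpy()

# Toy data: word occurrences labeled with a coarse semantic class.
examples = [
    ("She deposited money at the bank.", "bank", "organization"),
    ("They walked along the bank of the river.", "bank", "location"),
    ("The company opened a new office.", "company", "organization"),
    ("The village lies near the river.", "village", "location"),
]

layer = 12  # top layer; lower layers can be probed the same way for comparison
X = [word_embedding(s, w, layer) for s, w, _ in examples]
y = [label for _, _, label in examples]

# A linear probe: its accuracy indicates how well the layer's representations
# expose semantic classes, i.e., how well words are interpreted in context.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probing accuracy:", probe.score(X, y))
```

The paper compares layers as well as pretrained and finetuned models; a real setup would evaluate the probe on held-out word occurrences rather than on its own training data as in this toy sketch.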
- Anthology ID: 2020.findings-emnlp.109
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
- Month: November
- Year: 2020
- Address: Online
- Editors: Trevor Cohn, Yulan He, Yang Liu
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 1219–1234
- URL: https://aclanthology.org/2020.findings-emnlp.109
- DOI: 10.18653/v1/2020.findings-emnlp.109
- Cite (ACL): Mengjie Zhao, Philipp Dufter, Yadollah Yaghoobzadeh, and Hinrich Schütze. 2020. Quantifying the Contextualization of Word Representations with Semantic Class Probing. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1219–1234, Online. Association for Computational Linguistics.
- Cite (Informal): Quantifying the Contextualization of Word Representations with Semantic Class Probing (Zhao et al., Findings 2020)
- PDF: https://preview.aclanthology.org/emnlp22-frontmatter/2020.findings-emnlp.109.pdf
- Data: GLUE, MRPC, Penn Treebank, SST, SST-2