Abstract
This paper investigates contextual language models, which produce token representations, as a resource for lexical semantics at the word or type level. We construct multi-prototype word embeddings from bert-base-uncased (Devlin et al., 2018). These embeddings retain contextual knowledge that is critical for some type-level tasks, while being less cumbersome and less subject to outlier effects than exemplar models. Similarity and relatedness estimation, both type-level tasks, benefit from this contextual knowledge, indicating the context-sensitivity of these processes. BERT’s token-level knowledge also allows the testing of a type-level hypothesis about lexical abstractness, demonstrating the relationship between token-level phenomena and type-level concreteness ratings. Our findings provide important insight into the interpretability of BERT: layer 7 approximates semantic similarity, while the final layer (11) approximates relatedness.
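The authors' implementation is in the linked gchronis/mprobert repository; as an illustration only, here is a minimal sketch of how multi-prototype embeddings might be built from BERT token representations, assuming the common recipe of collecting a word's contextual vectors across example sentences and clustering them with k-means. The layer index, cluster count, function names, and single-wordpiece handling are placeholder assumptions, not the paper's settings.

```python
# Hypothetical sketch: multi-prototype embeddings from BERT token vectors.
# Assumes the `transformers` and `scikit-learn` libraries; parameters are illustrative.
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def token_vectors(word, sentences, layer=7):
    """Collect the hidden state of `word` at one BERT layer, one vector per sentence."""
    target_id = tokenizer(word, add_special_tokens=False)["input_ids"][0]
    vecs = []
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).hidden_states[layer][0]   # (seq_len, 768)
        ids = enc["input_ids"][0].tolist()
        if target_id in ids:                                 # first wordpiece only, for simplicity
            vecs.append(hidden[ids.index(target_id)])
    return torch.stack(vecs)

def prototypes(vectors, k=5):
    """Cluster the token vectors with k-means; the centroids serve as prototypes."""
    k = min(k, len(vectors))
    km = KMeans(n_clusters=k, n_init=10).fit(vectors.numpy())
    return km.cluster_centers_                               # (k, 768)

# Example: the two senses of "bishop" should land in different clusters.
sentences = [
    "The bishop moved diagonally across the board.",
    "The bishop blessed the congregation on Sunday.",
]
protos = prototypes(token_vectors("bishop", sentences), k=2)
```

Type-level similarity or relatedness between two words could then be estimated by comparing their prototype sets, for example via the minimum or average pairwise cosine distance between centroids; this aggregation step is likewise an assumption of the sketch.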
- Anthology ID: 2020.conll-1.17
- Volume: Proceedings of the 24th Conference on Computational Natural Language Learning
- Month: November
- Year: 2020
- Address: Online
- Editors: Raquel Fernández, Tal Linzen
- Venue: CoNLL
- SIG: SIGNLL
- Publisher: Association for Computational Linguistics
- Pages: 227–244
- URL: https://aclanthology.org/2020.conll-1.17
- DOI: 10.18653/v1/2020.conll-1.17
- Cite (ACL): Gabriella Chronis and Katrin Erk. 2020. When is a bishop not like a rook? When it’s like a rabbi! Multi-prototype BERT embeddings for estimating semantic relationships. In Proceedings of the 24th Conference on Computational Natural Language Learning, pages 227–244, Online. Association for Computational Linguistics.
- Cite (Informal): When is a bishop not like a rook? When it’s like a rabbi! Multi-prototype BERT embeddings for estimating semantic relationships (Chronis & Erk, CoNLL 2020)
- PDF: https://preview.aclanthology.org/emnlp-22-attachments/2020.conll-1.17.pdf
- Code: gchronis/mprobert