Putting Words in BERT’s Mouth: Navigating Contextualized Vector Spaces with Pseudowords
Taelin Karidi, Yichu Zhou, Nathan Schneider, Omri Abend, Vivek Srikumar
Abstract
We present a method for exploring regions around individual points in a contextualized vector space (particularly, BERT space), as a way to investigate how these regions correspond to word senses. By inducing a contextualized “pseudoword” vector as a stand-in for a static embedding in the input layer, and then performing masked prediction of a word in the sentence, we are able to investigate the geometry of the BERT-space in a controlled manner around individual instances. Using our method on a set of carefully constructed sentences targeting highly ambiguous English words, we find substantial regularity in the contextualized space, with regions that correspond to distinct word senses; but between these regions there are occasionally “sense voids”: regions that do not correspond to any intelligible sense.
- Anthology ID:
- 2021.emnlp-main.806
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 10300–10313
- URL:
- https://aclanthology.org/2021.emnlp-main.806
- DOI:
- 10.18653/v1/2021.emnlp-main.806
- Cite (ACL):
- Taelin Karidi, Yichu Zhou, Nathan Schneider, Omri Abend, and Vivek Srikumar. 2021. Putting Words in BERT’s Mouth: Navigating Contextualized Vector Spaces with Pseudowords. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10300–10313, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Putting Words in BERT’s Mouth: Navigating Contextualized Vector Spaces with Pseudowords (Karidi et al., EMNLP 2021)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2021.emnlp-main.806.pdf
- Code:
- tai314159/pwibm-putting-words-in-bert-s-mouth