Abstract
As the name implies, contextualized representations of language are typically motivated by their ability to encode context. Which aspects of context are captured by such representations? We introduce an approach to address this question using Representational Similarity Analysis (RSA). As case studies, we investigate the degree to which a verb embedding encodes the verb’s subject, a pronoun embedding encodes the pronoun’s antecedent, and a full-sentence representation encodes the sentence’s head word (as determined by a dependency parse). In all cases, we show that BERT’s contextualized embeddings reflect the linguistic dependency being studied, and that BERT encodes these dependencies to a greater degree than it encodes less linguistically-salient controls. These results demonstrate the ability of our approach to adjudicate between hypotheses about which aspects of context are encoded in representations of language.
- Anthology ID:
- 2020.coling-main.325
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 3637–3651
- URL:
- https://aclanthology.org/2020.coling-main.325
- DOI:
- 10.18653/v1/2020.coling-main.325
- Cite (ACL):
- Michael Lepori and R. Thomas McCoy. 2020. Picking BERT’s Brain: Probing for Linguistic Dependencies in Contextualized Embeddings Using Representational Similarity Analysis. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3637–3651, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- Picking BERT’s Brain: Probing for Linguistic Dependencies in Contextualized Embeddings Using Representational Similarity Analysis (Lepori & McCoy, COLING 2020)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2020.coling-main.325.pdf
- Code:
- mlepori1/Picking_BERTs_Brain
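
The abstract names Representational Similarity Analysis (RSA) without spelling out the computation. As a rough illustration only, and not the authors' implementation, the sketch below shows one common form of RSA: build a dissimilarity vector for each of two representation spaces over the same stimuli, then rank-correlate the two vectors. The choice of 1 minus Pearson correlation as the dissimilarity, the Spearman-style rank correlation without tie handling, the function names, and the random toy data are all assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def dissimilarities(vectors):
    """Upper triangle of the dissimilarity matrix (1 - Pearson) over all stimulus pairs."""
    n = len(vectors)
    return [
        1.0 - pearson(vectors[i], vectors[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def ranks(values):
    """Rank positions of the values (assumption: no ties, so no tie correction)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    for rank, index in enumerate(order):
        result[index] = float(rank)
    return result


def rsa_score(reps_a, reps_b):
    """Rank correlation between the dissimilarity structures of two spaces.

    reps_a and reps_b hold one vector per stimulus, in the same stimulus order.
    The two spaces may have different dimensionality.
    """
    return pearson(ranks(dissimilarities(reps_a)), ranks(dissimilarities(reps_b)))


# Toy data (hypothetical): 6 stimuli, 8 dimensions each.
random.seed(0)
model_reps = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
# A rescaled and shifted copy keeps the same pairwise correlation structure.
reference_reps = [[2.0 * v + 5.0 for v in row] for row in model_reps]
unrelated_reps = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]

print(rsa_score(model_reps, reference_reps))
print(rsa_score(model_reps, unrelated_reps))
```

In a probing setup like the one the abstract describes, `model_reps` would hold contextualized embeddings and the second argument would hold representations derived from the hypothesis under test; a higher score for one hypothesis than for a control suggests that hypothesis better matches the geometry of the embeddings.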