Abstract
It has been argued that BERT “rediscovers the traditional NLP pipeline”, with lower layers extracting morphosyntactic features and higher layers creating holistic sentence-level representations. In this paper, we critically examine this assumption through a principal-component-guided analysis, extracting sets of inputs that correspond to specific activation patterns in BERT sentence representations. We find that even in higher layers, the model mostly picks up on a variegated bunch of low-level features, many related to sentence complexity, that presumably arise from its specific pre-training objectives.
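To make the abstract's method concrete, below is a minimal sketch of a principal-component-guided analysis in this spirit: embed sentences with BERT, fit PCA to the pooled representations, and retrieve the inputs that score at the extremes of each component. The checkpoint (`bert-base-uncased`), mean pooling over the final layer, and the toy sentences are illustrative assumptions, not necessarily the paper's exact setup.

```python
# Hypothetical sketch: PCA over BERT sentence representations, then
# inspecting the inputs that score highest/lowest on each component.
# Pooling strategy, layer, and sentences are illustrative assumptions.
import torch
from sklearn.decomposition import PCA
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "The cat sat on the mat.",
    "Despite the rain, the committee, which had convened early, adjourned late.",
    "Dogs bark.",
    "An analysis of the proposal was undertaken by the subcommittee.",
]

def embed(sents, layer=-1):
    """Mean-pooled token representations from a given hidden layer."""
    enc = tokenizer(sents, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, output_hidden_states=True)
    hidden = out.hidden_states[layer]            # (batch, seq, dim)
    mask = enc["attention_mask"].unsqueeze(-1)   # (batch, seq, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

X = embed(sentences)
pca = PCA(n_components=2).fit(X)
scores = pca.transform(X)                        # (n_sentences, n_components)

# For each principal component, look at the extreme sentences; the kinds of
# inputs that cluster at the poles hint at what the component encodes.
for c in range(scores.shape[1]):
    order = scores[:, c].argsort()
    print(f"PC{c + 1} low : {sentences[order[0]]}")
    print(f"PC{c + 1} high: {sentences[order[-1]]}")
```

With a larger, more varied sentence pool, contrasting the extremes of each component is one way to surface the low-level, complexity-related features the abstract describes.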
- Anthology ID: 2023.iwcs-1.12
- Volume: Proceedings of the 15th International Conference on Computational Semantics
- Month: June
- Year: 2023
- Address: Nancy, France
- Editors: Maxime Amblard, Ellen Breitholtz
- Venue: IWCS
- SIG: SIGSEM
- Publisher: Association for Computational Linguistics
- Pages: 99–105
- URL: https://aclanthology.org/2023.iwcs-1.12
- Cite (ACL): Dmitry Nikolaev and Sebastian Padó. 2023. The Universe of Utterances According to BERT. In Proceedings of the 15th International Conference on Computational Semantics, pages 99–105, Nancy, France. Association for Computational Linguistics.
- Cite (Informal): The Universe of Utterances According to BERT (Nikolaev & Padó, IWCS 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-3/2023.iwcs-1.12.pdf