Abstract
A recent body of work has demonstrated that Transformer embeddings can be linearly decomposed into well-defined sums of factors, which can in turn be related to specific network inputs or components. There is, however, still a dearth of work studying whether these mathematical reformulations are empirically meaningful. In the present work, we study representations from machine-translation decoders using two such embedding decomposition methods. Our results indicate that, while decomposition-derived indicators effectively correlate with model performance, variation across different runs suggests a more nuanced take on this question. The high variability of our measurements indicates that geometry reflects model-specific characteristics more than it does sentence-specific computations, and that similar training conditions do not guarantee similar vector spaces.
- Anthology ID: 2023.blackboxnlp-1.10
- Volume: Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Yonatan Belinkov, Sophie Hao, Jaap Jumelet, Najoung Kim, Arya McCarthy, Hosein Mohebbi
- Venues: BlackboxNLP | WS
- Publisher: Association for Computational Linguistics
- Pages: 127–141
- URL: https://aclanthology.org/2023.blackboxnlp-1.10
- DOI: 10.18653/v1/2023.blackboxnlp-1.10
- Cite (ACL): Timothee Mickus and Raúl Vázquez. 2023. Why Bother with Geometry? On the Relevance of Linear Decompositions of Transformer Embeddings. In Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 127–141, Singapore. Association for Computational Linguistics.
- Cite (Informal): Why Bother with Geometry? On the Relevance of Linear Decompositions of Transformer Embeddings (Mickus & Vázquez, BlackboxNLP-WS 2023)
- PDF: https://preview.aclanthology.org/improve-issue-templates/2023.blackboxnlp-1.10.pdf