Abstract
Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages. We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning. The results suggest that most of this information is encoded in a non-linear way, while some of it can also be recovered with purely linear tools. As part of our analysis, we test the hypothesis that mBERT learns representations which contain both a language-encoding component and an abstract, cross-lingual component, and explicitly identify an empirical language-identity subspace within mBERT representations.
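The abstract's claim that some word-level translation information is linearly recoverable lends itself to a short illustration. The sketch below is not the authors' released code (see the Code link below for that); it shows one plausible mean-shift probe under stated assumptions: the bert-base-multilingual-cased checkpoint is real, but the layer index, the tiny word lists, and the subtract-source-mean / add-target-mean recipe are illustrative assumptions.

```python
# A minimal sketch (assumed recipe, not the paper's exact procedure):
# estimate a per-language "mean vector" from mBERT representations and
# shift a word's vector from the source-language mean toward the
# target-language mean before a nearest-neighbor lookup.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(words, layer=8):  # layer choice is an illustrative assumption
    """Mean-pool the hidden states of `layer` over each word's subtokens."""
    vecs = []
    for w in words:
        enc = tokenizer(w, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc, output_hidden_states=True)
        # drop [CLS]/[SEP], average the remaining subtoken vectors
        h = out.hidden_states[layer][0, 1:-1]
        vecs.append(h.mean(dim=0))
    return torch.stack(vecs)

en_words = ["dog", "house", "water", "book"]   # illustrative word lists
de_words = ["Hund", "Haus", "Wasser", "Buch"]

en_vecs, de_vecs = embed(en_words), embed(de_words)
mu_en, mu_de = en_vecs.mean(0), de_vecs.mean(0)  # empirical language means

# "Translate" an English word by removing the English language component
# and adding the German one, then retrieve the nearest German candidate.
query = en_vecs[0] - mu_en + mu_de
sims = torch.nn.functional.cosine_similarity(query.unsqueeze(0), de_vecs)
print(de_words[int(sims.argmax())])
```

In the abstract's terms, subtracting the empirical language mean approximates removing the language-identity component, leaving a more cross-lingual representation for nearest-neighbor matching.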
- Anthology ID: 2020.blackboxnlp-1.5
- Volume: Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
- Month: November
- Year: 2020
- Address: Online
- Editors: Afra Alishahi, Yonatan Belinkov, Grzegorz Chrupała, Dieuwke Hupkes, Yuval Pinter, Hassan Sajjad
- Venue: BlackboxNLP
- Publisher: Association for Computational Linguistics
- Pages: 45–56
- URL: https://aclanthology.org/2020.blackboxnlp-1.5
- DOI: 10.18653/v1/2020.blackboxnlp-1.5
- Cite (ACL): Hila Gonen, Shauli Ravfogel, Yanai Elazar, and Yoav Goldberg. 2020. It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT. In Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 45–56, Online. Association for Computational Linguistics.
- Cite (Informal): It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT (Gonen et al., BlackboxNLP 2020)
- PDF: https://preview.aclanthology.org/fix-volume-bibkeys/2020.blackboxnlp-1.5.pdf
- Code: gonenhila/mbert