It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

Hila Gonen, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg


Abstract
Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages. We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning. The results suggest that most of this information is encoded in a non-linear way, while some of it can also be recovered with purely linear tools. As part of our analysis, we test the hypothesis that mBERT learns representations which contain both a language-encoding component and an abstract, cross-lingual component, and we explicitly identify an empirical language-identity subspace within mBERT representations.
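The abstract's idea of a language-identity component suggests a simple recipe: estimate each language's "identity" as the mean mBERT representation of a sample of its words, then induce a translation by shifting a word vector from the source-language mean toward the target-language mean and taking a nearest neighbor. The sketch below is illustrative only, not the authors' released code (see the repository linked at the bottom of this page); the layer choice, the tiny word lists, the out-of-context single-word embedding, and all function names are assumptions made for the example.

# Minimal sketch of mean-shift word translation with off-the-shelf mBERT.
# Assumptions: layer 8 hidden states, words embedded without sentence context,
# and small illustrative vocabularies; none of this is prescribed by the paper page.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(word: str, layer: int = 8) -> torch.Tensor:
    """Average the hidden states of a word's subword tokens at a given layer."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[layer]
    # Drop the [CLS]/[SEP] positions and average the remaining subword vectors.
    return hidden[0, 1:-1].mean(dim=0)

def language_mean(words: list[str]) -> torch.Tensor:
    """Empirical 'language identity' vector: the mean embedding of sample words."""
    return torch.stack([embed(w) for w in words]).mean(dim=0)

def translate(word: str, src_mean: torch.Tensor, tgt_mean: torch.Tensor,
              candidates: list[str]) -> str:
    """Shift a word from the source to the target language region,
    then return the nearest candidate by cosine similarity."""
    query = embed(word) - src_mean + tgt_mean
    sims = torch.stack([torch.cosine_similarity(query, embed(c), dim=0)
                        for c in candidates])
    return candidates[int(sims.argmax())]

# Illustrative usage with tiny, hypothetical English/German word lists:
en_mean = language_mean(["house", "water", "book", "city", "dog"])
de_mean = language_mean(["Haus", "Wasser", "Buch", "Stadt", "Hund"])
print(translate("dog", en_mean, de_mean, ["Haus", "Wasser", "Buch", "Stadt", "Hund"]))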
Anthology ID:
2020.blackboxnlp-1.5
Volume:
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
Month:
November
Year:
2020
Address:
Online
Venue:
BlackboxNLP
Publisher:
Association for Computational Linguistics
Pages:
45–56
URL:
https://aclanthology.org/2020.blackboxnlp-1.5
DOI:
10.18653/v1/2020.blackboxnlp-1.5
Cite (ACL):
Hila Gonen, Shauli Ravfogel, Yanai Elazar, and Yoav Goldberg. 2020. It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT. In Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 45–56, Online. Association for Computational Linguistics.
Cite (Informal):
It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT (Gonen et al., BlackboxNLP 2020)
PDF:
https://aclanthology.org/2020.blackboxnlp-1.5.pdf
Code:
gonenhila/mbert