Abstract
Pretrained multilingual encoders enable zero-shot cross-lingual transfer, but often produce unreliable models that exhibit high performance variance on the target language. We postulate that this high variance results from zero-shot cross-lingual transfer solving an under-specified optimization problem. We show that any model linearly interpolated between the source-language monolingual model and the source + target bilingual model has equally low source-language generalization error, yet the target-language generalization error decreases smoothly and linearly as we move from the monolingual to the bilingual model. This suggests that the model struggles to identify solutions that are good for both the source and target languages when it uses the source language alone. Additionally, we show that the zero-shot solution lies in a non-flat region of the target-language generalization error surface, which causes the high variance.
- Anthology ID:
- 2022.repl4nlp-1.25
- Volume:
- Proceedings of the 7th Workshop on Representation Learning for NLP
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Spandana Gella, He He, Bodhisattwa Prasad Majumder, Burcu Can, Eleonora Giunchiglia, Samuel Cahyawijaya, Sewon Min, Maximilian Mozes, Xiang Lorraine Li, Isabelle Augenstein, Anna Rogers, Kyunghyun Cho, Edward Grefenstette, Laura Rimell, Chris Dyer
- Venue:
- RepL4NLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 236–248
- URL:
- https://aclanthology.org/2022.repl4nlp-1.25
- DOI:
- 10.18653/v1/2022.repl4nlp-1.25
- Cite (ACL):
- Shijie Wu, Benjamin Van Durme, and Mark Dredze. 2022. Zero-shot Cross-lingual Transfer is Under-specified Optimization. In Proceedings of the 7th Workshop on Representation Learning for NLP, pages 236–248, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Zero-shot Cross-lingual Transfer is Under-specified Optimization (Wu et al., RepL4NLP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.repl4nlp-1.25.pdf
- Code:
- shijie-wu/crosslingual-nlp
- Data:
- XNLI
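
The linear interpolation described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper's code: the `Params` type, the function names, and the toy parameter values are all hypothetical, and a real experiment would interpolate the full encoder weights and evaluate each interpolated model on source- and target-language data.

```python
from typing import Dict, List, Tuple

# Hypothetical flat parameter store: parameter name -> list of values.
Params = Dict[str, List[float]]


def interpolate(theta_mono: Params, theta_bi: Params, alpha: float) -> Params:
    """Return (1 - alpha) * theta_mono + alpha * theta_bi, parameter by parameter.

    alpha = 0 gives the source-language monolingual model,
    alpha = 1 gives the source + target bilingual model.
    """
    if theta_mono.keys() != theta_bi.keys():
        raise ValueError("both models must share the same parameter names")
    mixed: Params = {}
    for name, mono_values in theta_mono.items():
        bi_values = theta_bi[name]
        if len(mono_values) != len(bi_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        mixed[name] = [
            (1.0 - alpha) * m + alpha * b
            for m, b in zip(mono_values, bi_values)
        ]
    return mixed


def interpolation_path(
    theta_mono: Params, theta_bi: Params, steps: int
) -> List[Tuple[float, Params]]:
    """Models at alpha = 0, 1/steps, ..., 1 along the straight line between the two."""
    return [
        (i / steps, interpolate(theta_mono, theta_bi, i / steps))
        for i in range(steps + 1)
    ]


if __name__ == "__main__":
    # Toy values only; they stand in for fine-tuned model weights.
    mono = {"w": [0.0, 2.0]}
    bi = {"w": [4.0, 2.0]}
    for alpha, theta in interpolation_path(mono, bi, steps=4):
        print(alpha, theta)
```

Evaluating each model along such a path on both languages is one way to trace how the two generalization errors change between the monolingual and bilingual solutions.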