Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables

Weizhi Wang, Zhirui Zhang, Yichao Du, Boxing Chen, Jun Xie, Weihua Luo


Abstract
Zero-shot translation, that is, translating directly between language pairs unseen in training, is a promising capability of multilingual neural machine translation (NMT). However, because of the maximum likelihood training objective, the model tends to capture spurious correlations between the output language and language-invariant semantics, which leads to poor transfer performance on zero-shot directions. In this paper, we introduce a denoising autoencoder objective based on the pivot language into the traditional training objective to improve translation accuracy on zero-shot directions. A theoretical analysis from the perspective of latent variables shows that our approach implicitly maximizes the probability distributions for zero-shot directions. On two benchmark machine translation datasets, we demonstrate that the proposed method effectively eliminates the spurious correlations and significantly outperforms state-of-the-art methods. Our code is available at https://github.com/Victorwz/zs-nmt-dae.
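To make the idea in the abstract concrete, the following is a minimal sketch of how a denoising autoencoder example on the pivot language might be mixed into ordinary translation training data. It is not the authors' implementation: the helper names `add_noise` and `make_examples`, the drop/mask noise model, the `<2xx>` target-language tag, and the choice of English as pivot are all assumptions made for illustration.

```python
import random


def add_noise(tokens, drop_prob=0.1, mask_prob=0.1, mask_token="<mask>", rng=None):
    """Corrupt a token list by randomly dropping or masking tokens.

    The noise model here is an assumption, not taken from the paper.
    """
    rng = rng or random.Random(0)
    noised = []
    for tok in tokens:
        r = rng.random()
        if r < drop_prob:
            continue                      # drop the token
        if r < drop_prob + mask_prob:
            noised.append(mask_token)     # replace the token with a mask
        else:
            noised.append(tok)            # keep the token unchanged
    return noised or [mask_token]         # never return an empty source


def make_examples(src_tokens, tgt_tokens, tgt_lang, pivot_lang="en", rng=None):
    """Return (source, target) training pairs for one parallel sentence pair."""
    tag = f"<2{tgt_lang}>"                # hypothetical target-language tag
    # Ordinary translation example, trained with maximum likelihood.
    examples = [([tag] + src_tokens, tgt_tokens)]
    if tgt_lang == pivot_lang:
        # Denoising example: reconstruct the pivot sentence from a noised copy.
        noised = add_noise(tgt_tokens, rng=rng)
        examples.append(([tag] + noised, tgt_tokens))
    return examples
```

Under these assumptions, calling `make_examples(["guten", "morgen"], ["good", "morning"], "en")` yields the usual translation pair plus one extra pair whose source is a corrupted copy of the English sentence, so both objectives can be trained with the same cross-entropy loss.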
Anthology ID:
2021.findings-emnlp.366
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4321–4327
URL:
https://aclanthology.org/2021.findings-emnlp.366
DOI:
10.18653/v1/2021.findings-emnlp.366
Cite (ACL):
Weizhi Wang, Zhirui Zhang, Yichao Du, Boxing Chen, Jun Xie, and Weihua Luo. 2021. Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4321–4327, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables (Wang et al., Findings 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2021.findings-emnlp.366.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-4/2021.findings-emnlp.366.mp4
Code:
victorwz/zs-nmt-dae