When Does Unsupervised Machine Translation Work?

Kelly Marchisio; Kevin Duh; Philipp Koehn

Abstract

Despite the reported success of unsupervised machine translation (MT), the field has yet to examine the conditions under which the methods succeed and fail. We conduct an extensive empirical evaluation using dissimilar language pairs, dissimilar domains, and diverse datasets. We find that performance rapidly deteriorates when source and target corpora are from different domains, and that stochasticity during embedding training can dramatically affect downstream results. We additionally find that unsupervised MT performance declines when source and target languages use different scripts, and observe very poor performance on authentic low-resource language pairs. We advocate for extensive empirical evaluation of unsupervised MT systems to highlight failure points and encourage continued research on the most promising paradigms. We release our preprocessed dataset to encourage evaluations that stress-test systems under multiple data conditions.

Anthology ID: 2020.wmt-1.68
Volume: Proceedings of the Fifth Conference on Machine Translation
Month: November
Year: 2020
Address: Online
Venues: EMNLP | WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 571–583
URL: https://aclanthology.org/2020.wmt-1.68
Cite (ACL): Kelly Marchisio, Kevin Duh, and Philipp Koehn. 2020. When Does Unsupervised Machine Translation Work? In Proceedings of the Fifth Conference on Machine Translation, pages 571–583, Online. Association for Computational Linguistics.
Cite (Informal): When Does Unsupervised Machine Translation Work? (Marchisio et al., WMT 2020)
PDF: https://preview.aclanthology.org/update-css-js/2020.wmt-1.68.pdf
Video: https://slideslive.com/38939554
