Abstract
Zero-shot neural machine translation (NMT) is a framework that uses source-pivot and target-pivot parallel data to train a source-target NMT system. An extension to zero-shot NMT is zero-resource NMT, which generates pseudo-parallel corpora using a zero-shot system and further trains the zero-shot system on that data. In this paper, we expand on zero-resource NMT by incorporating monolingual data in the pivot language into training; since the pivot language is usually the highest-resource language of the three, we expect monolingual pivot-language data to be most abundant. We propose methods for generating pseudo-parallel corpora using pivot-language monolingual data and for leveraging the pseudo-parallel corpora to improve the zero-shot NMT system. We evaluate these methods for a high-resource language pair (German-Russian) using English as the pivot. We show that our proposed methods yield consistent improvements over strong zero-shot and zero-resource baselines and even catch up to pivot-based models in BLEU (while not requiring the two-pass inference that pivot models require).
- Anthology ID:
- D19-5610
- Volume:
- Proceedings of the 3rd Workshop on Neural Generation and Translation
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong
- Editors:
- Alexandra Birch, Andrew Finch, Hiroaki Hayashi, Ioannis Konstas, Thang Luong, Graham Neubig, Yusuke Oda, Katsuhito Sudoh
- Venue:
- NGT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 99–107
- URL:
- https://aclanthology.org/D19-5610
- DOI:
- 10.18653/v1/D19-5610
- Cite (ACL):
- Anna Currey and Kenneth Heafield. 2019. Zero-Resource Neural Machine Translation with Monolingual Pivot Data. In Proceedings of the 3rd Workshop on Neural Generation and Translation, pages 99–107, Hong Kong. Association for Computational Linguistics.
- Cite (Informal):
- Zero-Resource Neural Machine Translation with Monolingual Pivot Data (Currey & Heafield, NGT 2019)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/D19-5610.pdf