Abstract
Despite impressive progress in high-resource settings, Neural Machine Translation (NMT) still struggles in low-resource and out-of-domain scenarios, often failing to match the quality of phrase-based translation. We propose a novel technique that combines back-translation and multilingual NMT to improve performance in these difficult cases. Our technique trains a single model for both directions of a language pair, allowing us to back-translate source or target monolingual data without requiring an auxiliary model. We then continue training on the augmented parallel data, enabling a cycle of improvement for a single model that can incorporate any source, target, or parallel data to improve both translation directions. As a byproduct, these models can reduce training and deployment costs significantly compared to uni-directional models. Extensive experiments show that our technique outperforms standard back-translation in low-resource scenarios, improves quality on cross-domain tasks, and effectively reduces costs across the board.
- Anthology ID:
- W18-2710
- Volume:
- Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Editors:
- Alexandra Birch, Andrew Finch, Thang Luong, Graham Neubig, Yusuke Oda
- Venue:
- NGT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 84–91
- URL:
- https://aclanthology.org/W18-2710
- DOI:
- 10.18653/v1/W18-2710
- Cite (ACL):
- Xing Niu, Michael Denkowski, and Marine Carpuat. 2018. Bi-Directional Neural Machine Translation with Synthetic Parallel Data. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pages 84–91, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- Bi-Directional Neural Machine Translation with Synthetic Parallel Data (Niu et al., NGT 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/W18-2710.pdf
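The training cycle the abstract describes can be sketched in a few lines: one model serves both directions of a language pair by tagging each input with a target-language token, and the same model back-translates monolingual data into synthetic parallel pairs for continued training. This is a minimal illustrative sketch, not the paper's implementation; the tag format (`<2en>`, `<2fr>`), function names, and the placeholder `model` are all assumptions.

```python
def tag(sentence, target_lang):
    """Prefix a sentence with a target-language token so a single
    model can translate in either direction (tag format is assumed)."""
    return f"<2{target_lang}> {sentence}"

def make_bidirectional_corpus(parallel_pairs, src_lang, tgt_lang):
    """Duplicate each parallel pair in both directions, tagging the
    input side with the desired output language."""
    examples = []
    for src, tgt in parallel_pairs:
        examples.append((tag(src, tgt_lang), tgt))  # src -> tgt
        examples.append((tag(tgt, src_lang), src))  # tgt -> src
    return examples

def back_translate(model, monolingual, output_lang):
    """Use the single bi-directional model to synthesize the missing
    side of monolingual data, yielding synthetic parallel pairs that
    can be appended to the training corpus."""
    return [(model(tag(sent, output_lang)), sent) for sent in monolingual]

# Hypothetical stand-in for a trained NMT model: strips the language
# tag and upper-cases the text, purely to show the data flow.
dummy_model = lambda tagged: tagged.split("> ", 1)[1].upper()

corpus = make_bidirectional_corpus([("bonjour", "hello")], "fr", "en")
synthetic = back_translate(dummy_model, ["good night"], "fr")
```

In a real cycle, the model would be retrained on `corpus + tagged synthetic pairs`, and the process repeated, so that monolingual data on either side improves both translation directions of the one model.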