Distilling Multiple Domains for Neural Machine Translation

Anna Currey, Prashant Mathur, Georgiana Dinu


Abstract
Neural machine translation achieves impressive results in high-resource conditions, but performance often suffers when the input domain is low-resource. The standard practice of adapting a separate model for each domain of interest does not scale well in practice, both from a quality perspective (brittleness under domain shift) and from a cost perspective (added maintenance and inference complexity). In this paper, we propose a framework for training a single multi-domain neural machine translation model that is able to translate several domains without increasing inference time or memory usage. We show that this model can improve translation on both high- and low-resource domains over strong multi-domain baselines. In addition, our proposed model is effective when domain labels are unknown during training, as well as robust under noisy data conditions.
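The abstract does not spell out the training objective, but the title points to knowledge distillation across domains. As a rough, hedged illustration only, the sketch below shows one common way to distill domain-specific teacher models into a single student: the student loss interpolates cross-entropy on the reference with a KL term toward the matching domain teacher. All names and hyperparameters here (alpha, temperature, the per-batch domain label) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (assumed formulation, not necessarily the paper's exact objective):
# distill several domain-specific teachers into one multi-domain student by mixing
# reference cross-entropy with KL divergence to the teacher of the batch's domain.
import torch
import torch.nn.functional as F


def multi_domain_distill_loss(student_logits, teacher_logits, target_ids,
                              pad_id, alpha=0.5, temperature=1.0):
    """student_logits, teacher_logits: (batch, seq_len, vocab); target_ids: (batch, seq_len)."""
    vocab = student_logits.size(-1)

    # Standard token-level cross-entropy against the reference translation.
    ce = F.cross_entropy(student_logits.view(-1, vocab),
                         target_ids.view(-1),
                         ignore_index=pad_id)

    # Soft targets from the domain-specific teacher (temperature-scaled).
    log_p_student = F.log_softmax(student_logits.view(-1, vocab) / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits.view(-1, vocab) / temperature, dim=-1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    return alpha * ce + (1.0 - alpha) * kl


# Usage sketch: for each training batch, select the teacher matching the batch's
# domain label (or a fallback teacher when labels are unknown) and distill into
# the shared student. `teachers` is a hypothetical dict of frozen domain models.
def training_step(student, teachers, batch, pad_id, optimizer):
    teacher = teachers[batch["domain"]]
    with torch.no_grad():
        teacher_logits = teacher(batch["source"], batch["target_in"])
    student_logits = student(batch["source"], batch["target_in"])
    loss = multi_domain_distill_loss(student_logits, teacher_logits,
                                     batch["target_out"], pad_id)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time only the single student is used, which is consistent with the abstract's claim that inference time and memory usage do not increase with the number of domains.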
Anthology ID:
2020.emnlp-main.364
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4500–4511
URL:
https://aclanthology.org/2020.emnlp-main.364
DOI:
10.18653/v1/2020.emnlp-main.364
Cite (ACL):
Anna Currey, Prashant Mathur, and Georgiana Dinu. 2020. Distilling Multiple Domains for Neural Machine Translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4500–4511, Online. Association for Computational Linguistics.
Cite (Informal):
Distilling Multiple Domains for Neural Machine Translation (Currey et al., EMNLP 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.emnlp-main.364.pdf
Video:
https://slideslive.com/38939168