Training Neural Machine Translation to Apply Terminology Constraints

Georgiana Dinu; Prashant Mathur; Marcello Federico; Yaser Al-Onaizan

doi:10.18653/v1/P19-1294

Abstract

This paper proposes a novel method to inject custom terminology into neural machine translation at run time. Previous work has mainly proposed modifications to the decoding algorithm that constrain the output to include target terms provided at run time. While effective, these constrained decoding methods add significant computational overhead to the inference step and, as we show in this paper, can be brittle when tested in realistic conditions. We approach the problem by training a neural MT system to learn how to use custom terminology when it is provided with the input. Comparative experiments show that our method is not only more effective than a state-of-the-art implementation of constrained decoding, but also as fast as constraint-free decoding.
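
The abstract does not spell out how terminology is "provided with the input". As a rough illustration only, the sketch below assumes one plausible scheme: each matched source word is followed inline by its target term, and a parallel tag stream marks which tokens are ordinary source words, source terms, or injected target terms. The function name `annotate_source`, the `glossary` dictionary, the tag values 0/1/2, and the restriction to single-token source terms are all assumptions made for this example, not details taken from the abstract.

```python
from typing import Dict, List, Tuple


def annotate_source(
    tokens: List[str], glossary: Dict[str, str]
) -> Tuple[List[str], List[int]]:
    """Insert target terms inline after matching source tokens.

    Hypothetical tag scheme (an assumption for this sketch):
    0 = ordinary source token, 1 = source term, 2 = injected target term.
    Only single-token source terms are matched, case-insensitively.
    """
    out_tokens: List[str] = []
    tags: List[int] = []
    for tok in tokens:
        target = glossary.get(tok.lower())
        if target is None:
            # No glossary entry: copy the token through unchanged.
            out_tokens.append(tok)
            tags.append(0)
            continue
        # Keep the source term, then append its required translation.
        out_tokens.append(tok)
        tags.append(1)
        for target_tok in target.split():
            out_tokens.append(target_tok)
            tags.append(2)
    return out_tokens, tags


# Example: an English source sentence with one English-German glossary entry.
tokens, tags = annotate_source(
    "All alternates shall be elected".split(),
    {"alternates": "Stellvertreter"},
)
print(list(zip(tokens, tags)))
```

A model trained on inputs annotated this way would see the required translation as part of its ordinary input, so decoding itself needs no constraint handling, which is consistent with the speed claim in the abstract.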

Anthology ID: P19-1294
Volume: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2019
Address: Florence, Italy
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 3063–3068
URL: https://aclanthology.org/P19-1294
DOI: 10.18653/v1/P19-1294
Cite (ACL): Georgiana Dinu, Prashant Mathur, Marcello Federico, and Yaser Al-Onaizan. 2019. Training Neural Machine Translation to Apply Terminology Constraints. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3063–3068, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Training Neural Machine Translation to Apply Terminology Constraints (Dinu et al., ACL 2019)
PDF: https://preview.aclanthology.org/emnlp22-frontmatter/P19-1294.pdf
Code: mtresearcher/terminology_dataset
