A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation

Mathias Müller; Annette Rios Gonzales; Elena Voita; Rico Sennrich

doi:10.18653/v1/W18-6307

A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation

Mathias Müller, Annette Rios, Elena Voita, Rico Sennrich

Abstract

The translation of pronouns presents a special challenge to machine translation to this day, since it often requires context outside the current sentence. Recent work on models that have access to information across sentence boundaries has seen only moderate improvements in terms of automatic evaluation metrics such as BLEU. However, metrics that quantify the overall translation quality are ill-equipped to measure gains from additional context. We argue that a different kind of evaluation is needed to assess how well models translate inter-sentential phenomena such as pronouns. This paper therefore presents a test suite of contrastive translations focused specifically on the translation of pronouns. Furthermore, we perform experiments with several context-aware models. We show that, while gains in BLEU are moderate for those systems, they outperform baselines by a large margin in terms of accuracy on our contrastive test set. Our experiments also show the effectiveness of parameter tying for multi-encoder architectures.

Anthology ID:: W18-6307
Volume:: Proceedings of the Third Conference on Machine Translation: Research Papers
Month:: October
Year:: 2018
Address:: Brussels, Belgium
Editors:: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue:: WMT
SIG:: SIGMT
Publisher:: Association for Computational Linguistics
Note:
Pages:: 61–72
Language:
URL:: https://preview.aclanthology.org/iwcs-25-ingestion/W18-6307/
DOI:: 10.18653/v1/W18-6307
Bibkey:
Cite (ACL):: Mathias Müller, Annette Rios, Elena Voita, and Rico Sennrich. 2018. A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 61–72, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):: A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation (Müller et al., WMT 2018)
Copy Citation:
PDF:: https://preview.aclanthology.org/iwcs-25-ingestion/W18-6307.pdf

PDF Cite Search Fix data