A Pronoun Test Suite Evaluation of the English–German MT Systems at WMT 2018

Liane Guillou, Christian Hardmeier, Ekaterina Lapshinova-Koltunski, Sharid Loáiciga


Abstract
We evaluate the output of 16 English-to-German MT systems with respect to the translation of pronouns in the context of the WMT 2018 competition. We work with a test suite specifically designed to assess system quality in various fine-grained categories known to be problematic. The main evaluation scores come from a semi-automatic process, combining automatic reference matching with extensive manual annotation of uncertain cases. We find that current NMT systems are good at translating pronouns with intra-sentential reference, but the inter-sentential cases remain difficult. NMT systems are also good at the translation of event pronouns, unlike systems from the phrase-based SMT paradigm. No single system performs best at translating all types of anaphoric pronouns, suggesting unexplained random effects influencing the translation of pronouns with NMT.
Anthology ID:
W18-6435
Volume:
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Month:
October
Year:
2018
Address:
Belgium, Brussels
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
570–577
URL:
https://aclanthology.org/W18-6435
DOI:
10.18653/v1/W18-6435
Cite (ACL):
Liane Guillou, Christian Hardmeier, Ekaterina Lapshinova-Koltunski, and Sharid Loáiciga. 2018. A Pronoun Test Suite Evaluation of the English–German MT Systems at WMT 2018. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 570–577, Belgium, Brussels. Association for Computational Linguistics.
Cite (Informal):
A Pronoun Test Suite Evaluation of the English–German MT Systems at WMT 2018 (Guillou et al., WMT 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W18-6435.pdf
Data:
ParCorFull