Machine Translation Evaluation beyond the Sentence Level

Jindřich Libovický, Thomas Brovelli, Bruno Cartoni


Abstract
Automatic machine translation evaluation has been crucial for the rapid development of machine translation systems over the last two decades. So far, most attention has been paid to evaluation metrics that work with text at the sentence level, as did the translation systems themselves. Translation quality across sentences depends on discourse phenomena that may not manifest at all within sentence boundaries (e.g. coreference, discourse connectives, verb tense sequence). To tackle this, we propose several document-level MT evaluation metrics: generalizations of sentence-level metrics, language-(pair)-independent versions of lexical cohesion scores, and metrics of coreference and morphology preservation in the target texts. We measure their agreement with human judgment on a newly created dataset of pairwise paragraph comparisons for four language pairs.
Anthology ID:
2018.eamt-main.18
Volume:
Proceedings of the 21st Annual Conference of the European Association for Machine Translation
Month:
May
Year:
2018
Address:
Alicante, Spain
Editors:
Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Miquel Esplà-Gomis, Maja Popović, Celia Rico, André Martins, Joachim Van den Bogaert, Mikel L. Forcada
Venue:
EAMT
Pages:
199–208
URL:
https://aclanthology.org/2018.eamt-main.18
Cite (ACL):
Jindřich Libovický, Thomas Brovelli, and Bruno Cartoni. 2018. Machine Translation Evaluation beyond the Sentence Level. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation, pages 199–208, Alicante, Spain.
Cite (Informal):
Machine Translation Evaluation beyond the Sentence Level (Libovický et al., EAMT 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2018.eamt-main.18.pdf