@inproceedings{castilho-2020-page,
    title = "On the Same Page? Comparing Inter-Annotator Agreement in Sentence and Document Level Human Machine Translation Evaluation",
    author = "Castilho, Sheila",
    editor = {Barrault, Lo{\"i}c  and
      Bojar, Ond{\v{r}}ej  and
      Bougares, Fethi  and
      Chatterjee, Rajen  and
      Costa-juss{\`a}, Marta R.  and
      Federmann, Christian  and
      Fishel, Mark  and
      Fraser, Alexander  and
      Graham, Yvette  and
      Guzman, Paco  and
      Haddow, Barry  and
      Huck, Matthias  and
      Yepes, Antonio Jimeno  and
      Koehn, Philipp  and
      Martins, Andr{\'e}  and
      Morishita, Makoto  and
      Monz, Christof  and
      Nagata, Masaaki  and
      Nakazawa, Toshiaki  and
      Negri, Matteo},
    booktitle = "Proceedings of the Fifth Conference on Machine Translation",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.wmt-1.137/",
    pages = "1150--1159",
    abstract = {Document-level evaluation of machine translation has raised interest in the community, especially since responses to the claims of ``human parity'' (Toral et al., 2018; L{\"a}ubli et al., 2018) with document-level human evaluations have been published. Yet, little is known about best practices regarding human evaluation of machine translation at the document level. This paper presents a comparison of the differences in inter-annotator agreement between quality assessments using sentence- and document-level set-ups. We report results of the agreement between professional translators for fluency and adequacy scales, error annotation, and pair-wise ranking, along with the effort needed to perform the different tasks. To the best of our knowledge, this is the first study of its kind.}
}