Error profiling for evaluation of machine-translated text: a Polish-English case study

Sandra Weiss, Lars Ahrenberg


Abstract
We present a study of Polish-English machine translation in which the impact of various types of errors on the cohesion and comprehensibility of the translations was investigated. The following phenomena are in focus: (i) the most common errors produced by current state-of-the-art MT systems for Polish-English MT; (ii) the effect of different types of errors on text cohesion; (iii) the effect of different types of errors on readers' understanding of the translation. We found that errors of incorrect and missing translations are the most common for current systems, while the category of non-translated words had the most negative impact on comprehension. All three of these categories contributed to the breaking of cohesive chains. The correlation between the number of errors found in a translation and the number of wrong answers in the comprehension tests was low. Another result was that non-native speakers of English performed at least as well as native speakers on the comprehension tests.
Anthology ID:
L12-1430
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1764–1770
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/736_Paper.pdf
Cite (ACL):
Sandra Weiss and Lars Ahrenberg. 2012. Error profiling for evaluation of machine-translated text: a Polish-English case study. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1764–1770, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Error profiling for evaluation of machine-translated text: a Polish-English case study (Weiss & Ahrenberg, LREC 2012)
PDF:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/736_Paper.pdf