Terra: a Collection of Translation Error-Annotated Corpora

Mark Fishel, Ondřej Bojar, Maja Popović


Abstract
Recently, the first methods of automatic diagnostics of machine translation (MT) have emerged; since this area of research is relatively young, the efforts are not coordinated. We present a collection of translation error-annotated corpora, consisting of automatically produced translations and their detailed manual translation error analysis. Using the collected corpora, we evaluate the available state-of-the-art methods of MT diagnostics and assess how well the methods perform, how they compare to each other, and whether they can be useful in practice.
Anthology ID:
L12-1260
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
7–14
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/481_Paper.pdf
Cite (ACL):
Mark Fishel, Ondřej Bojar, and Maja Popović. 2012. Terra: a Collection of Translation Error-Annotated Corpora. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 7–14, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Terra: a Collection of Translation Error-Annotated Corpora (Fishel et al., LREC 2012)