Abstract
In this paper, we present a freely available corpus of automatic translations accompanied by post-edited versions and annotated with labels identifying the different kinds of errors made by the MT system. These data have been extracted from translation students' exercises that have been corrected by a senior professor. This corpus can be useful for training quality estimation tools and for analyzing the types of errors made by MT systems.
- Anthology ID:
- L14-1085
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3585–3588
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/1115_Paper.pdf
- Cite (ACL):
- Guillaume Wisniewski, Natalie Kübler, and François Yvon. 2014. A Corpus of Machine Translation Errors Extracted from Translation Students Exercises. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3585–3588, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- A Corpus of Machine Translation Errors Extracted from Translation Students Exercises (Wisniewski et al., LREC 2014)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/1115_Paper.pdf