Abstract
This work proposes a new method for the manual evaluation of Machine Translation (MT) output based on marking actual issues in the translated text. The novelty is that evaluators neither assign scores nor classify errors, but mark all problematic parts (words, phrases, sentences) of the translation. The main advantage of this method is that the resulting annotations not only provide overall scores, obtained by counting words with assigned tags, but can also be used for analysis of errors and challenging linguistic phenomena, as well as of inter-annotator disagreements. Typical manual evaluations, in which annotators are asked to assign overall scores or to rank two or more translations, do not enable such detailed analysis and understanding of the actual problems. The proposed method is very general: it can be applied to any genre/domain and language pair, and it can be guided by various types of quality criteria. It is also not restricted to MT output, but can be used for other types of generated text.
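The abstract notes that overall scores fall out of the annotations simply by counting tagged words. Below is a minimal sketch of that counting step, assuming a token-index span format for the annotations; the function name and data layout are hypothetical illustrations, not taken from the paper or its released data.

```python
# Hypothetical sketch: turning span-level issue annotations into an
# overall score by counting tagged words, as the abstract describes.
# The annotation format (token-index spans) is an assumption.

def issue_word_rate(tokens, issue_spans):
    """Fraction of words covered by at least one marked issue span.

    tokens: list of words in the translated sentence
    issue_spans: list of (start, end) token indices, end exclusive
    """
    flagged = set()
    for start, end in issue_spans:
        flagged.update(range(start, end))
    return len(flagged) / len(tokens) if tokens else 0.0

translation = "the cat sat on on the mat".split()
spans = [(3, 5)]  # annotator marked the duplicated "on on"
print(issue_word_rate(translation, spans))  # 2/7 ≈ 0.286
```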
- Anthology ID: 2020.coling-main.444
- Volume: Proceedings of the 28th International Conference on Computational Linguistics
- Month: December
- Year: 2020
- Address: Barcelona, Spain (Online)
- Editors: Donia Scott, Nuria Bel, Chengqing Zong
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 5059–5069
- URL: https://aclanthology.org/2020.coling-main.444
- DOI: 10.18653/v1/2020.coling-main.444
- Cite (ACL): Maja Popović. 2020. Informative Manual Evaluation of Machine Translation Output. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5059–5069, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal): Informative Manual Evaluation of Machine Translation Output (Popović, COLING 2020)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.coling-main.444.pdf
- Code: m-popovic/qrev-annotations
- Data: IMDb Movie Reviews