Quality In, Quality Out: Learning from Actual Mistakes

Frederic Blain, Nikolaos Aletras, Lucia Specia


Abstract
Approaches to Quality Estimation (QE) of machine translation have shown promising results at predicting quality scores for translated sentences. However, QE models are often trained on noisy approximations of quality annotations, derived from the proportion of post-edited words in translated sentences, instead of direct human annotations of translation errors. The latter are a more reliable ground truth but more expensive to obtain. In this paper, we present the first attempt to model the task of predicting the proportion of actual translation errors in a sentence while minimising the need for direct human annotation. For that purpose, we use transfer learning to leverage large-scale noisy annotations and small sets of high-fidelity human-annotated translation errors to train QE models. Experiments on four language pairs and translations obtained by statistical and neural models show consistent gains over strong baselines.
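To make the two kinds of sentence-level label contrasted in the abstract concrete, here is a minimal sketch of how each could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and a plain word-level Levenshtein distance stands in for the edit-rate metric used in practice, which the abstract does not specify.

```python
from typing import List


def word_edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions).

    Assumption: a simplification of the edit-rate metrics used in practice.
    """
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a word from hyp
                            curr[j - 1] + 1,      # insert a word from ref
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def post_edit_rate(mt: List[str], post_edited: List[str]) -> float:
    """Noisy label: word edits from MT output to its post-edit,
    normalised by post-edit length and capped at 1.0."""
    if not post_edited:
        return 0.0
    return min(1.0, word_edit_distance(mt, post_edited) / len(post_edited))


def error_proportion(error_flags: List[bool]) -> float:
    """Direct label: share of MT words a human annotator flagged as errors."""
    if not error_flags:
        return 0.0
    return sum(error_flags) / len(error_flags)


# Hypothetical example sentence, not taken from the paper's data.
mt = "the cat sit on mat".split()
post_edited = "the cat sat on the mat".split()
flags = [False, False, True, False, False]  # annotator flags only "sit"

noisy_label = post_edit_rate(mt, post_edited)
direct_label = error_proportion(flags)
```

In this toy case the post-edit changes two words while the annotator marks one error, so the two labels differ for the same sentence. That gap is the kind of noise the abstract refers to.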
Anthology ID:
2020.eamt-1.16
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
145–153
URL:
https://aclanthology.org/2020.eamt-1.16
Cite (ACL):
Frederic Blain, Nikolaos Aletras, and Lucia Specia. 2020. Quality In, Quality Out: Learning from Actual Mistakes. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 145–153, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Quality In, Quality Out: Learning from Actual Mistakes (Blain et al., EAMT 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.eamt-1.16.pdf