The Trials and Tribulations of Predicting Post-Editing Productivity

Lena Marg


Abstract
While an increasing number of (automatic) metrics are available to assess the linguistic quality of machine translations, their interpretation remains cryptic to many users, particularly in the translation community. They are clearly useful for indicating certain overarching trends, but say little about actual improvements for translation buyers or post-editors. Nevertheless, these metrics are commonly referenced when discussing pricing and models, both with translation buyers and service providers. With the aim of focusing on automatic metrics that are easier for non-research users to understand, we identified Edit Distance (or Post-Edit Distance) as a good fit. While Edit Distance as such does not express cognitive effort or time spent editing machine translation suggestions, we found that it correlates strongly with the results of the productivity tests we performed, for various language pairs and domains. This paper aims to analyse Edit Distance and productivity data on a segment level, based on data gathered over several years. Drawing on these findings, we then want to explore how Edit Distance could help in predicting productivity on new content. Some further analysis is proposed, with findings to be presented at the conference.
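For readers unfamiliar with the metric, the following is a minimal sketch of how a segment-level Post-Edit Distance might be computed. The abstract does not specify the exact formulation used, so the word-level tokenisation, the normalisation by the longer segment, and the function names `levenshtein` and `post_edit_distance` are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y, or keep
        prev = curr
    return prev[-1]


def post_edit_distance(mt_output, post_edited):
    """Word-level edit distance scaled to 0..1 by the longer segment length.

    Assumption: whitespace tokenisation and this normalisation are illustrative
    choices, not necessarily those used in the paper.
    """
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    longest = max(len(mt_tokens), len(pe_tokens))
    if longest == 0:
        return 0.0
    return levenshtein(mt_tokens, pe_tokens) / longest
```

A score of 0 means the post-editor left the machine translation untouched, and a score near 1 means the segment was almost entirely rewritten. Per-segment scores of this kind could then be compared against measured post-editing throughput.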
Anthology ID:
L16-1004
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
23–26
URL:
https://aclanthology.org/L16-1004
Cite (ACL):
Lena Marg. 2016. The Trials and Tribulations of Predicting Post-Editing Productivity. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 23–26, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
The Trials and Tribulations of Predicting Post-Editing Productivity (Marg, LREC 2016)
PDF:
https://preview.aclanthology.org/improve-issue-templates/L16-1004.pdf