Machine Translation for Subtitling: A Large-Scale Evaluation
Thierry Etchegoyhen, Lindsay Bywood, Mark Fishel, Panayota Georgakopoulou, Jie Jiang, Gerard van Loenhout, Arantza del Pozo, Mirjam Sepesy Maučec, Anja Turner, Martin Volk
Abstract
This article describes a large-scale evaluation of the use of Statistical Machine Translation for professional subtitling. The work was carried out within the FP7 EU-funded project SUMAT and involved two rounds of evaluation: a quality evaluation and a measure of productivity gain/loss. We present the SMT systems built for the project and the corpora they were trained on, which combine professionally created and crowd-sourced data. Evaluation goals, methodology and results are presented for the eleven translation pairs that were evaluated by professional subtitlers. Overall, a majority of the machine translated subtitles received good quality ratings. The results were also positive in terms of productivity, with a global gain approaching 40%. We also evaluated the impact of applying quality estimation and filtering of poor MT output, which resulted in higher productivity gains for filtered files as opposed to fully machine-translated files. Finally, we present and discuss feedback from the subtitlers who participated in the evaluation, a key aspect for any eventual adoption of machine translation technology in professional subtitling.- Anthology ID:
- L14-1392
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- 46–53
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/463_Paper.pdf
- DOI:
- Cite (ACL):
- Thierry Etchegoyhen, Lindsay Bywood, Mark Fishel, Panayota Georgakopoulou, Jie Jiang, Gerard van Loenhout, Arantza del Pozo, Mirjam Sepesy Maučec, Anja Turner, and Martin Volk. 2014. Machine Translation for Subtitling: A Large-Scale Evaluation. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 46–53, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Machine Translation for Subtitling: A Large-Scale Evaluation (Etchegoyhen et al., LREC 2014)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/463_Paper.pdf