Customizing Neural Machine Translation for Subtitling

Evgeny Matusov, Patrick Wilken, Yota Georgakopoulou


Abstract
In this work, we customized a neural machine translation system for translation of subtitles in the domain of entertainment. The neural translation model was adapted to the subtitling content and style and extended by a simple, yet effective technique for utilizing inter-sentence context for short sentences such as dialog turns. The main contribution of the paper is a novel subtitle segmentation algorithm that predicts the end of a subtitle line given the previous word-level context using a recurrent neural network learned from human segmentation decisions. This model is combined with subtitle length and duration constraints established in the subtitling industry. We conducted a thorough human evaluation with two post-editors (English-to-Spanish translation of a documentary and a sitcom). It showed a notable productivity increase of up to 37% as compared to translating from scratch and significant reductions in human translation edit rate in comparison with the post-editing of the baseline non-adapted system without a learned segmentation model.
Anthology ID:
W19-5209
Volume:
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
Month:
August
Year:
2019
Address:
Florence, Italy
Venues:
ACL | WMT | WS
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
82–93
URL:
https://aclanthology.org/W19-5209
DOI:
10.18653/v1/W19-5209
Cite (ACL):
Evgeny Matusov, Patrick Wilken, and Yota Georgakopoulou. 2019. Customizing Neural Machine Translation for Subtitling. In Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers), pages 82–93, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Customizing Neural Machine Translation for Subtitling (Matusov et al., 2019)
PDF:
https://preview.aclanthology.org/update-css-js/W19-5209.pdf
Data:
OpenSubtitles