Abstract
In this work, we customized a neural machine translation system for translation of subtitles in the domain of entertainment. The neural translation model was adapted to the subtitling content and style and extended by a simple, yet effective technique for utilizing inter-sentence context for short sentences such as dialog turns. The main contribution of the paper is a novel subtitle segmentation algorithm that predicts the end of a subtitle line given the previous word-level context using a recurrent neural network learned from human segmentation decisions. This model is combined with subtitle length and duration constraints established in the subtitling industry. We conducted a thorough human evaluation with two post-editors (English-to-Spanish translation of a documentary and a sitcom). It showed a notable productivity increase of up to 37% as compared to translating from scratch and significant reductions in human translation edit rate in comparison with the post-editing of the baseline non-adapted system without a learned segmentation model.
- Anthology ID: W19-5209
- Volume: Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
- Month: August
- Year: 2019
- Address: Florence, Italy
- Venue: WMT
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 82–93
- URL: https://aclanthology.org/W19-5209
- DOI: 10.18653/v1/W19-5209
- Cite (ACL): Evgeny Matusov, Patrick Wilken, and Yota Georgakopoulou. 2019. Customizing Neural Machine Translation for Subtitling. In Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers), pages 82–93, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal): Customizing Neural Machine Translation for Subtitling (Matusov et al., WMT 2019)
- PDF: https://preview.aclanthology.org/ingestion-script-update/W19-5209.pdf
- Data: OpenSubtitles