Towards End-to-End In-Image Neural Machine Translation

Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain


Abstract
In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language. We propose an end-to-end neural model for this task inspired by recent approaches to neural machine translation, and demonstrate promising initial results based purely on pixel-level supervision. We then offer a quantitative and qualitative evaluation of our system outputs and discuss some common failure modes. Finally, we conclude with directions for future work.
Anthology ID:
2020.nlpbt-1.8
Volume:
Proceedings of the First International Workshop on Natural Language Processing Beyond Text
Month:
November
Year:
2020
Address:
Online
Venue:
nlpbt
Publisher:
Association for Computational Linguistics
Pages:
70–74
URL:
https://aclanthology.org/2020.nlpbt-1.8
DOI:
10.18653/v1/2020.nlpbt-1.8
Cite (ACL):
Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, and Puneet Jain. 2020. Towards End-to-End In-Image Neural Machine Translation. In Proceedings of the First International Workshop on Natural Language Processing Beyond Text, pages 70–74, Online. Association for Computational Linguistics.
Cite (Informal):
Towards End-to-End In-Image Neural Machine Translation (Mansimov et al., nlpbt 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.nlpbt-1.8.pdf
Optional supplementary material:
 2020.nlpbt-1.8.OptionalSupplementaryMaterial.pdf
Video:
 https://slideslive.com/38939782