CUNI System for the WMT18 Multimodal Translation Task

Jindřich Helcl, Jindřich Libovický, Dušan Variš


Abstract
We present our submission to the WMT18 Multimodal Translation Task. The main feature of our submission is applying a self-attentive network instead of a recurrent neural network. We evaluate two methods of incorporating the visual features in the model: first, we include the image representation as another input to the network; second, we train the model to predict the visual features as an auxiliary objective. For our submission, we acquired both textual and multimodal additional data. Both of the proposed methods yield significant improvements over recurrent networks and self-attentive textual baselines.
Anthology ID:
W18-6441
Volume:
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Month:
October
Year:
2018
Address:
Belgium, Brussels
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
616–623
URL:
https://aclanthology.org/W18-6441
DOI:
10.18653/v1/W18-6441
Cite (ACL):
Jindřich Helcl, Jindřich Libovický, and Dušan Variš. 2018. CUNI System for the WMT18 Multimodal Translation Task. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 616–623, Belgium, Brussels. Association for Computational Linguistics.
Cite (Informal):
CUNI System for the WMT18 Multimodal Translation Task (Helcl et al., WMT 2018)
PDF:
https://preview.aclanthology.org/naacl24-info/W18-6441.pdf
Data
MS COCO