Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation

Minghan Wang; Jiaxin Guo; Yimeng Chen; Chang Su; Min Zhang; Shimin Tao; Hao Yang

Abstract

Multimodal machine translation (MMT) models based on large-scale pretrained networks are liable to overfit the limited labelled training data available for MMT, which is a critical issue in the field. To this end, we propose a transfer learning solution. Specifically, 1) a vanilla Transformer is pre-trained on a massive bilingual text-only corpus to obtain prior knowledge; 2) a multimodal Transformer named VLTransformer is proposed, with several components that incorporate visual contexts; and 3) the parameters of VLTransformer are initialized with the pre-trained vanilla Transformer and then fine-tuned on MMT tasks with a newly proposed method named cross-modal masking, which forces the model to learn from both modalities. We evaluate on the Multi30k en-de and en-fr datasets and improve the BLEU score by up to 8% compared with the SOTA performance. The experimental results demonstrate that performing transfer learning with a monomodal pre-trained NMT model on multimodal NMT tasks can obtain considerable boosts.
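Step 3 above, initializing a multimodal model from a text-only checkpoint, can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter names, the `init_from_pretrained` helper, and the use of flat lists in place of weight tensors are all assumptions made for illustration. The idea shown is only that parameters shared with the pre-trained model are copied over, while the new visual components keep their fresh initialization.

```python
def init_from_pretrained(target_params, pretrained_params):
    """Copy every pretrained entry whose name and size match the target.

    Returns the merged parameters and the names left at their fresh
    initialization (hypothetical helper, not from the paper).
    """
    merged = dict(target_params)
    untouched = []
    for name, value in target_params.items():
        source = pretrained_params.get(name)
        if source is not None and len(source) == len(value):
            merged[name] = list(source)  # reuse the text-only prior knowledge
        else:
            untouched.append(name)  # no counterpart in the text-only model
    return merged, untouched


# Hypothetical parameter names; flat lists of floats stand in for weight tensors.
text_only = {
    "encoder.layer0.weight": [0.1, 0.2, 0.3],
    "decoder.layer0.weight": [0.4, 0.5, 0.6],
}
multimodal = {
    "encoder.layer0.weight": [0.0, 0.0, 0.0],
    "decoder.layer0.weight": [0.0, 0.0, 0.0],
    "visual_projection.weight": [0.01, -0.02],  # assumed new visual component
}

merged, untouched = init_from_pretrained(multimodal, text_only)
print(untouched)  # ['visual_projection.weight']
```

In a real framework the same pattern would typically be a non-strict state-dict load, after which the whole model is fine-tuned on the MMT data.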

Anthology ID: 2021.mtsummit-research.12
Volume: Proceedings of Machine Translation Summit XVIII: Research Track
Month: August
Year: 2021
Address: Virtual
Editors: Kevin Duh, Francisco Guzmán
Venue: MTSummit
Publisher: Association for Machine Translation in the Americas
Pages: 139–149
URL: https://aclanthology.org/2021.mtsummit-research.12
Cite (ACL): Minghan Wang, Jiaxin Guo, Yimeng Chen, Chang Su, Min Zhang, Shimin Tao, and Hao Yang. 2021. Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation. In Proceedings of Machine Translation Summit XVIII: Research Track, pages 139–149, Virtual. Association for Machine Translation in the Americas.
Cite (Informal): Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation (Wang et al., MTSummit 2021)
PDF: https://preview.aclanthology.org/nschneid-patch-2/2021.mtsummit-research.12.pdf
Data: Flickr30k
