Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation
Minghan Wang, Jiaxin Guo, Yimeng Chen, Chang Su, Min Zhang, Shimin Tao, Hao Yang
Abstract
Models based on large-scale pretrained networks overfit easily on the limited labelled training data available for multimodal translation (MMT), which is a critical issue in MMT. To this end, we propose a transfer learning solution. Specifically: 1) a vanilla Transformer is pre-trained on a massive bilingual text-only corpus to obtain prior knowledge; 2) a multimodal Transformer named VLTransformer is proposed, with several components that incorporate visual contexts; and 3) the parameters of VLTransformer are initialized with the pre-trained vanilla Transformer and then fine-tuned on MMT tasks with a newly proposed method named cross-modal masking, which forces the model to learn from both modalities. We evaluate on the Multi30k en-de and en-fr datasets and improve the BLEU score by up to 8% compared with the SOTA performance. The experimental results demonstrate that performing transfer learning with a monomodal pre-trained NMT model on multimodal NMT tasks can yield considerable boosts.
- Anthology ID:
- 2021.mtsummit-research.12
- Volume:
- Proceedings of Machine Translation Summit XVIII: Research Track
- Month:
- August
- Year:
- 2021
- Address:
- Virtual
- Editors:
- Kevin Duh, Francisco Guzmán
- Venue:
- MTSummit
- Publisher:
- Association for Machine Translation in the Americas
- Pages:
- 139–149
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2021.mtsummit-research.12/
- Cite (ACL):
- Minghan Wang, Jiaxin Guo, Yimeng Chen, Chang Su, Min Zhang, Shimin Tao, and Hao Yang. 2021. Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation. In Proceedings of Machine Translation Summit XVIII: Research Track, pages 139–149, Virtual. Association for Machine Translation in the Americas.
- Cite (Informal):
- Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation (Wang et al., MTSummit 2021)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2021.mtsummit-research.12.pdf
- Data
- Flickr30k
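
Step 3 of the abstract (initializing VLTransformer from the pre-trained vanilla Transformer) can be illustrated with a minimal sketch. It assumes, hypothetically, that parameters are stored as name-to-values mappings and that shared components keep the same names in both models; the parameter names and the function `init_from_pretrained` are illustrative and are not taken from the paper. Components that exist only in the multimodal model keep their fresh initialization.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def init_from_pretrained(target: Params, pretrained: Params) -> Tuple[Params, List[str]]:
    """Copy each pretrained parameter whose name and size match the target.

    Returns the merged parameters and the names that kept their fresh
    initialization (for example, newly added visual components).
    """
    merged: Params = {}
    kept_fresh: List[str] = []
    for name, value in target.items():
        source = pretrained.get(name)
        if source is not None and len(source) == len(value):
            merged[name] = list(source)   # reuse text-only prior knowledge
        else:
            merged[name] = list(value)    # no counterpart: keep fresh init
            kept_fresh.append(name)
    return merged, kept_fresh


# Illustrative (made-up) parameter names and values.
text_only = {
    "encoder.layer0.weight": [0.1, 0.2],
    "decoder.layer0.weight": [0.3, 0.4],
}
multimodal = {
    "encoder.layer0.weight": [0.0, 0.0],
    "decoder.layer0.weight": [0.0, 0.0],
    "visual_proj.weight": [0.5, 0.5, 0.5],
}
merged, fresh = init_from_pretrained(multimodal, text_only)
```

In this toy setup only `visual_proj.weight` would need to be learned from scratch during fine-tuning on MMT data; the encoder and decoder start from the text-only weights.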