@inproceedings{mohammadshahi-etal-2019-aligning-multilingual,
    title = "Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task",
    author = "Mohammadshahi, Alireza  and
      Lebret, R{\'e}mi  and
      Aberer, Karl",
    editor = "Thorne, James  and
      Vlachos, Andreas  and
      Cocarascu, Oana  and
      Christodoulopoulos, Christos  and
      Mittal, Arpit",
    booktitle = "Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/iwcs-25-ingestion/D19-6605/",
    doi = "10.18653/v1/D19-6605",
    pages = "27--33",
    abstract = "In this paper, we propose a new approach to learn multimodal multilingual embeddings for matching images and their relevant captions in two languages. We combine two existing objective functions to make images and captions close in a joint embedding space while adapting the alignment of word embeddings between existing languages in our model. We show that our approach enables better generalization, achieving state-of-the-art performance in text-to-image and image-to-text retrieval task, and caption-caption similarity task. Two multimodal multilingual datasets are used for evaluation: Multi30k with German and English captions and Microsoft-COCO with English and Japanese captions."
}