@inproceedings{singhal-etal-2019-learning,
    title = "Learning Multilingual Word Embeddings Using Image-Text Data",
    author = "Singhal, Karan  and
      Raman, Karthik  and
      ten Cate, Balder",
    editor = "Bernardi, Raffaella  and
      Fernandez, Raquel  and
      Gella, Spandana  and
      Kafle, Kushal  and
      Kanan, Christopher  and
      Lee, Stefan  and
      Nabi, Moin",
    booktitle = "Proceedings of the Second Workshop on Shortcomings in Vision and Language",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/iwcs-25-ingestion/W19-1807/",
    doi = "10.18653/v1/W19-1807",
    pages = "68--77",
    abstract = "There has been significant interest recently in learning multilingual word embeddings {--} in which semantically similar words across languages have similar embeddings. State-of-the-art approaches have relied on expensive labeled data, which is unavailable for low-resource languages, or have involved post-hoc unification of monolingual embeddings. In the present paper, we investigate the efficacy of multilingual embeddings learned from weakly-supervised image-text data. In particular, we propose methods for learning multilingual embeddings using image-text data, by enforcing similarity between the representations of the image and that of the text. Our experiments reveal that even without using any expensive labeled data, a bag-of-words-based embedding model trained on image-text data achieves performance comparable to the state-of-the-art on crosslingual semantic similarity tasks."
}