ECNU_MIV at SemEval-2023 Task 1: CTIM - Contrastive Text-Image Model for Multilingual Visual Word Sense Disambiguation

Zhenghui Li, Qi Zhang, Xueyin Xia, Yinxiang Ye, Qi Zhang, Cong Huang


Abstract
Our team focuses on the multimodal domain of images and text. We propose a model that learns the matching relationship between text-image pairs through contrastive learning. More specifically, we train the model on the labeled data provided by the task organizers. After pre-training, text is used to reference the learned visual concepts, enabling visual word sense disambiguation. In addition, our team's top results have been released, showing the effectiveness of our solution.
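
The abstract does not give the training objective in detail, so the following is only a minimal sketch of how contrastive text-image matching is commonly set up, not the authors' implementation. It assumes a symmetric, CLIP-style cross-entropy loss over a batch of paired embeddings, followed by ranking candidate images by cosine similarity to the text embedding. The function names, the toy embeddings, and the temperature value of 0.07 are all illustrative assumptions.

```python
import math

def normalize(vec):
    # Scale a vector to unit length; a zero vector is returned unchanged.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def log_softmax_at(scores, index):
    # Numerically stable log-softmax, evaluated at one position.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[index] - log_z

def contrastive_loss(text_embs, image_embs, temperature=0.07):
    # Assumed objective: text i and image i form the positive pair,
    # every other pairing in the batch acts as a negative.
    n = len(text_embs)
    sim = [[cosine(t, v) / temperature for v in image_embs] for t in text_embs]
    # Text-to-image direction: softmax over each row.
    t2i = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Image-to-text direction: softmax over each column.
    i2t = -sum(
        log_softmax_at([sim[r][j] for r in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (t2i + i2t)

def rank_candidates(text_emb, candidate_embs):
    # Disambiguation step: order candidate images by similarity to the text.
    scores = [cosine(text_emb, c) for c in candidate_embs]
    return sorted(range(len(candidate_embs)), key=lambda i: -scores[i])

if __name__ == "__main__":
    texts = [[1.0, 0.0], [0.0, 1.0]]    # toy text embeddings
    images = [[1.0, 0.0], [0.0, 1.0]]   # toy image embeddings, aligned with texts
    print(contrastive_loss(texts, images))
    print(rank_candidates([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]))
```

In a real system the embeddings would come from trained text and image encoders; here they are hand-written vectors so that the loss and the ranking step can be inspected in isolation.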
Anthology ID:
2023.semeval-1.13
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
101–107
URL:
https://aclanthology.org/2023.semeval-1.13
DOI:
10.18653/v1/2023.semeval-1.13
Cite (ACL):
Zhenghui Li, Qi Zhang, Xueyin Xia, Yinxiang Ye, Qi Zhang, and Cong Huang. 2023. ECNU_MIV at SemEval-2023 Task 1: CTIM - Contrastive Text-Image Model for Multilingual Visual Word Sense Disambiguation. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 101–107, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
ECNU_MIV at SemEval-2023 Task 1: CTIM - Contrastive Text-Image Model for Multilingual Visual Word Sense Disambiguation (Li et al., SemEval 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.semeval-1.13.pdf