PMCoders at SemEval-2023 Task 1: RAltCLIP: Use Relative AltCLIP Features to Rank
Mohammad Javad Pirhadi, Motahhare Mirzaei, Mohammad Reza Mohammadi, Sauleh Eetemadi
Abstract
The Visual Word Sense Disambiguation (VWSD) task aims to find, among 10 candidate images, the one most related to an ambiguous word given a limited textual context. In this work, we use AltCLIP features and a standard 3-layer transformer encoder to compare the cosine similarity between the given phrase and the different images. We also improve our model’s generalization by using a subset of LAION-5B. The best official baseline achieves a macro-averaged hit rate of 37.20% and a macro-averaged MRR (Mean Reciprocal Rank) of 54.39%. Our best configuration reaches 39.61% and 56.78%, respectively. The code will be made publicly available on GitHub.
- Anthology ID:
- 2023.semeval-1.242
- Volume:
- Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1751–1755
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2023.semeval-1.242/
- DOI:
- 10.18653/v1/2023.semeval-1.242
- Cite (ACL):
- Mohammad Javad Pirhadi, Motahhare Mirzaei, Mohammad Reza Mohammadi, and Sauleh Eetemadi. 2023. PMCoders at SemEval-2023 Task 1: RAltCLIP: Use Relative AltCLIP Features to Rank. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 1751–1755, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- PMCoders at SemEval-2023 Task 1: RAltCLIP: Use Relative AltCLIP Features to Rank (Pirhadi et al., SemEval 2023)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2023.semeval-1.242.pdf
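
The abstract describes ranking candidate images by cosine similarity to the phrase and reports hit rate and MRR. The following is a minimal sketch of that ranking and scoring step only, not the authors' implementation: it omits AltCLIP feature extraction and the transformer encoder, assumes the embeddings are already available as plain non-zero float vectors, and takes hit rate to mean the fraction of cases where the gold image is ranked first. The function names `cosine`, `rank_images`, and `hit_rate_and_mrr` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_images(text_vec, image_vecs):
    """Return image indices sorted from most to least similar to the text."""
    scores = [cosine(text_vec, v) for v in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: scores[i], reverse=True)


def hit_rate_and_mrr(rankings, gold):
    """Compute hit rate (gold ranked first) and mean reciprocal rank.

    rankings: one ranked list of image indices per test instance.
    gold: the index of the correct image for each instance.
    """
    hits = 0
    reciprocal_sum = 0.0
    for ranking, gold_index in zip(rankings, gold):
        position = ranking.index(gold_index) + 1  # 1-based rank of the gold image
        hits += position == 1
        reciprocal_sum += 1.0 / position
    n = len(gold)
    return hits / n, reciprocal_sum / n


# Toy example with made-up 2-dimensional "embeddings" (illustrative values only).
text_vec = [1.0, 0.0]
image_vecs = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
ranking = rank_images(text_vec, image_vecs)
hit_rate, mrr = hit_rate_and_mrr([ranking, ranking], gold=[1, 2])
print(ranking, hit_rate, mrr)
```

A macro-averaged score, as reported in the abstract, would average such per-group values across groups; how the groups are defined is not stated in the abstract and is left out of this sketch.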