Query Generation Using GPT-3 for CLIP-Based Word Sense Disambiguation for Image Retrieval

Xiaomeng Pan, Zhousi Chen, Mamoru Komachi


Abstract
In this study, we propose using GPT-3 as a query generator for the backend of CLIP, where it serves as an implicit word sense disambiguation (WSD) component for the SemEval 2023 shared task Visual Word Sense Disambiguation (VWSD). We confirmed previous findings: human-like prompts adapted for WSD with quotes benefit both CLIP and GPT-3, whereas plain phrases or poorly templated prompts give the worst results.
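The contrast the abstract draws between plain phrases and quoted, human-like prompts can be illustrated with a minimal sketch. The template wording, the function names, and the toy embedding vectors below are all assumptions made for illustration; they are not the authors' prompts or code, and a real system would obtain the vectors from CLIP's text and image encoders.

```python
import math


def plain_query(phrase):
    # Baseline: the ambiguous phrase is passed through unchanged.
    return phrase


def quoted_query(target, phrase):
    # Hypothetical human-like template that puts the target word and its
    # context phrase in quotes; the exact wording is an assumption.
    return f'A photo of "{target}" as in "{phrase}".'


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_images(query_vec, image_vecs):
    # Order candidate image names by similarity to the query embedding,
    # most similar first. image_vecs maps image name -> embedding.
    return sorted(
        image_vecs,
        key=lambda name: cosine(query_vec, image_vecs[name]),
        reverse=True,
    )


if __name__ == "__main__":
    print(plain_query("bank erosion"))
    print(quoted_query("bank", "bank erosion"))
    # Toy 2-d vectors standing in for encoder outputs (assumed values).
    candidates = {"river.jpg": [1.0, 0.1], "atm.jpg": [0.0, 1.0]}
    print(rank_images([1.0, 0.0], candidates))
```

In this sketch the generated query string would be encoded and compared against each candidate image, so the only difference between conditions is the text handed to the encoder.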
Anthology ID:
2023.starsem-1.36
Volume:
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Alexis Palmer, Jose Camacho-Collados
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
417–422
URL:
https://aclanthology.org/2023.starsem-1.36
DOI:
10.18653/v1/2023.starsem-1.36
Cite (ACL):
Xiaomeng Pan, Zhousi Chen, and Mamoru Komachi. 2023. Query Generation Using GPT-3 for CLIP-Based Word Sense Disambiguation for Image Retrieval. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 417–422, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Query Generation Using GPT-3 for CLIP-Based Word Sense Disambiguation for Image Retrieval (Pan et al., *SEM 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.starsem-1.36.pdf