Keyi Li

2023

Rutgers Multimedia Image Processing Lab at SemEval-2023 Task-1: Text-Augmentation-based Approach for Visual Word Sense Disambiguation
Keyi Li | Sen Yang | Chenyang Gao | Ivan Marsic
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our system used in SemEval-2023 Task-1: Visual Word Sense Disambiguation (VWSD). The VWSD task is to identify the correct image that corresponds to an ambiguous target word given limited textual context. To reduce word ambiguity and enhance image selection, we proposed several text augmentation techniques, such as prompting, WordNet synonyms, and text generation. We experimented with different vision-language pre-trained models to capture the joint features of the augmented text and image. Our approach achieved the best performance using a combination of GPT-3 text generation and the CLIP model. On the multilingual test sets, our system achieved an average hit rate (at top-1) of 51.11 and a mean reciprocal rank of 65.69.
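The two metrics reported above, hit rate at top-1 and mean reciprocal rank (MRR), can be illustrated with a minimal sketch. It assumes each test instance yields a ranked list of candidate image identifiers and exactly one gold image that appears in that list. The function names and the toy data are hypothetical and are not taken from the authors' system. The functions return fractions in [0, 1]; scores such as 51.11 and 65.69 would correspond to those fractions expressed as percentages.

```python
from typing import Sequence


def hit_rate_at_1(rankings: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Fraction of instances whose top-ranked candidate is the gold image."""
    hits = sum(1 for ranked, g in zip(rankings, gold) if ranked[0] == g)
    return hits / len(gold)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Average of 1 / rank of the gold image (rank 1 is the top candidate).

    Assumes the gold image is always present in its ranked list.
    """
    total = 0.0
    for ranked, g in zip(rankings, gold):
        total += 1.0 / (list(ranked).index(g) + 1)
    return total / len(gold)


# Toy example (hypothetical data): three instances, gold image "a" in each.
toy_rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
toy_gold = ["a", "a", "a"]
print(hit_rate_at_1(toy_rankings, toy_gold))
print(mean_reciprocal_rank(toy_rankings, toy_gold))
```

In the toy example the gold image is ranked first, second, and third, so only one of three instances counts as a hit, while MRR still gives partial credit to the other two.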

Venues

SemEval (1)
