Mohammad Ali Sadraei Javaheri


2023

This paper presents an approach to Visual Word Sense Disambiguation (Visual-WSD), the task of determining the most appropriate image to represent a given polysemous word in one of its particular senses. The proposed approach leverages the CLIP model, prompt engineering, and text-to-image models such as GLIDE and DALL-E 2 for both image retrieval and generation. To evaluate our approach, we participated in the SemEval 2023 shared task on “Visual Word Sense Disambiguation (Visual-WSD)” in a zero-shot learning setting, where we compared the accuracy of different combinations of tools, including “Simple prompt-based” and “Generated prompt-based” methods for prompt engineering using completion models, and text-to-image models for changing the input modality from text to image. Moreover, we explored the benefits of cross-modality evaluation between text and candidate images using CLIP. Our experimental results demonstrate that the proposed approach outperforms cross-modality approaches, highlighting the potential of prompt engineering and text-to-image models to improve accuracy in Visual-WSD tasks. In this zero-shot scenario, our best attempt attained an accuracy of 68.75%.
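The candidate-ranking step described above can be illustrated with a minimal sketch. It assumes that the prompt (or a generated image) and each candidate image have already been encoded into a shared embedding space, as CLIP does, and that candidates are ranked by cosine similarity. The function names, file names, and three-dimensional vectors below are hypothetical placeholders chosen for illustration; they are not the embeddings, data, or code used in our system.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return candidate names sorted from most to least similar to the query."""
    scores = {name: cosine(query_vec, vec) for name, vec in candidate_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors standing in for CLIP embeddings (made-up values, assumption).
prompt_embedding = [0.9, 0.1, 0.0]  # e.g. an encoded prompt for one sense of "bank"
image_embeddings = {
    "river_bank.jpg": [0.8, 0.2, 0.1],
    "bank_building.jpg": [0.1, 0.9, 0.2],
    "piggy_bank.jpg": [0.0, 0.3, 0.9],
}

ranking = rank_candidates(prompt_embedding, image_embeddings)
print(ranking)
```

The same ranking function applies unchanged when the query vector is the embedding of a generated image rather than of a text prompt, which is the sense in which a text-to-image model changes the input modality.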
The human values expressed in argumentative texts can provide valuable insights into the culture of a society. They can be helpful in various applications such as value-based profiling and ethical analysis. However, one of the first steps toward this goal is to accurately detect the category of human value underlying an argument. This task is challenging due to the lack of data and the need for philosophical inference; even humans can find it difficult to classify arguments according to their underlying human values. This paper describes our model for SemEval 2023 Task 4 on human value detection. We propose a class-token attention-based model and evaluate it against baseline models, including a fine-tuned BERT language model and a keyword-based approach.