FJWU_Squad at SemEval-2025 Task 1: An Idiom Visual Understanding Dataset for Idiom Learning
Maira Khatoon, Arooj Kiyani, Tehmina Farid, Sadaf Abdul Rauf
Abstract
Idiomatic expressions pose difficulties for Natural Language Processing (NLP) because they are noncompositional. In this paper, we propose the Idiom Visual Understanding Dataset (IVUD), a multimodal dataset for idiom understanding using visual and textual representations. For SemEval-2025 Task 1 (AdMIRe), we specifically addressed dataset augmentation using AI-synthesized images and human-directed prompt engineering. We compared the efficacy of vision- and text-based models in ranking images aligned with idiomatic phrases. The results highlight the advantages of multimodal context for idiom understanding, showing that vision-language models perform better than text-only approaches in detecting idiomaticity.
- Anthology ID: 2025.semeval-1.231
- Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues: SemEval | WS
- Publisher: Association for Computational Linguistics
- Pages: 1759–1765
- URL: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.231/
- Cite (ACL): Maira Khatoon, Arooj Kiyani, Tehmina Farid, and Sadaf Abdul Rauf. 2025. FJWU_Squad at SemEval-2025 Task 1: An Idiom Visual Understanding Dataset for Idiom Learning. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1759–1765, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): FJWU_Squad at SemEval-2025 Task 1: An Idiom Visual Understanding Dataset for Idiom Learning (Khatoon et al., SemEval 2025)
- PDF: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.231.pdf