AIMA at SemEval-2025 Task 1: Bridging Text and Image for Idiomatic Knowledge Extraction via Mixture of Experts

Arash Rasouli, Erfan Sadraiye, Omid Ghahroodi, Hamid Rabiee, Ehsaneddin Asgari


Abstract
Idioms are integral components of language, playing a crucial role in understanding and processing linguistic expressions. Although extensive research has been conducted on the comprehension of idioms in the text domain, their interpretation in multi-modal spaces remains largely unexplored. In this work, we propose a multi-expert framework to investigate the transfer of idiomatic knowledge from the language to the vision modality. Through a series of experiments, we demonstrate that leveraging text-based representations of idioms can significantly enhance understanding of the visual space, bridging the gap between linguistic and visual semantics.
Anthology ID:
2025.semeval-1.296
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2270–2275
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.296/
Cite (ACL):
Arash Rasouli, Erfan Sadraiye, Omid Ghahroodi, Hamid Rabiee, and Ehsaneddin Asgari. 2025. AIMA at SemEval-2025 Task 1: Bridging Text and Image for Idiomatic Knowledge Extraction via Mixture of Experts. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2270–2275, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
AIMA at SemEval-2025 Task 1: Bridging Text and Image for Idiomatic Knowledge Extraction via Mixture of Experts (Rasouli et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.296.pdf