dutir914 at SemEval-2025 Task 1: An integrated approach for Multimodal Idiomaticity Representations
Yanan Wang, Dailin Li, Yicen Tian, Bo Zhang, Wang Jian, Liang Yang
Abstract
SemEval-2025 Task 1 introduces multimodal datasets for idiomatic expression representation. Subtask A focuses on ranking images based on potentially idiomatic noun compounds in given sentences. Idiom comprehension demands the fusion of visual and auditory elements with contextual semantics, yet existing datasets exhibit phrase-image discordance and culture-specific opacity, impeding cross-modal semantic alignment. To address these challenges, we propose an integrated approach that combines data augmentation and model fine-tuning in Subtask A. First, we construct two idiom datasets by generating visual metaphors for idiomatic expressions to fine-tune the CLIP model. Next, we propose a three-stage multimodal chain-of-thought method, fine-tuning Qwen2.5-VL-7B-Instruct to generate rationales and perform inference, alongside zero-shot experiments with Qwen2.5-VL-72B-Instruct. Finally, we integrate the output of different models through a voting mechanism to enhance the accuracy of multimodal semantic matching. This approach achieves 0.92 accuracy on the Portuguese test set and 0.93 on the English test set, ranking 3rd and 4th, respectively. The implementation code is publicly available at https://github.com/wyn1015/semeval.
- Anthology ID:
- 2025.semeval-1.159
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1198–1203
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.159/
- Cite (ACL):
- Yanan Wang, Dailin Li, Yicen Tian, Bo Zhang, Wang Jian, and Liang Yang. 2025. dutir914 at SemEval-2025 Task 1: An integrated approach for Multimodal Idiomaticity Representations. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1198–1203, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- dutir914 at SemEval-2025 Task 1: An integrated approach for Multimodal Idiomaticity Representations (Wang et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.159.pdf