ITUNLP2 at MWE-2026 AdMIRe 2: Modular Zero-Shot Pipelines for Multimodal Idiom Grounding and Ranking

Özge Umut, Bora Şenceylan


Abstract
We describe a zero-shot system for AdMIRe 2.0, a shared task on multimodal understanding of potentially idiomatic expressions (PIEs). Given a context sentence with a PIE and five candidate images, the system predicts whether the usage is literal or idiomatic and ranks images by how well they match the intended meaning. We use closed-source large multimodal models and compare prompting pipelines from direct one-step ranking to modular multi-step pipelines that separate sense prediction, PIE-focused image semantics, and final ranking. All steps produce constrained JSON outputs to enable deterministic parsing and composition. In the official AdMIRe 2.0 evaluation on CodaBench, our best pipeline achieves an average Top-1 accuracy of 0.52 and an average nDCG score of 0.70 across the 12 languages we submitted. We obtain the best score among submitted systems in 10 of these languages.
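As an illustration of the constrained-JSON composition and the two reported metrics, the following is a minimal sketch, not the authors' implementation. The JSON schema (`sense`, `ranking`), the function names, and the graded relevance values are hypothetical. The nDCG variant shown (linear gain, log2 discount) is one common definition and may differ from the official scorer.

```python
import json
import math


def parse_ranking(raw: str, n_images: int = 5) -> list[int]:
    """Parse a hypothetical step output such as
    {"sense": "idiomatic", "ranking": [3, 1, 5, 2, 4]}
    and check that the ranking is a permutation of 1..n_images."""
    data = json.loads(raw)
    ranking = data["ranking"]
    if sorted(ranking) != list(range(1, n_images + 1)):
        raise ValueError(f"ranking is not a permutation of 1..{n_images}: {ranking}")
    return ranking


def top1_accuracy(ranking: list[int], gold_best: int) -> float:
    """Return 1.0 if the first-ranked image is the gold best image, else 0.0."""
    return float(ranking[0] == gold_best)


def ndcg(ranking: list[int], relevance: dict[int, float]) -> float:
    """nDCG with linear gain and log2 position discount (assumed variant)."""
    dcg = sum(relevance[img] / math.log2(pos + 2) for pos, img in enumerate(ranking))
    ideal = sorted(relevance.values(), reverse=True)
    idcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical model output and hypothetical gold relevance grades per image id.
raw_output = '{"sense": "idiomatic", "ranking": [3, 1, 5, 2, 4]}'
gold_relevance = {1: 3.0, 2: 1.0, 3: 4.0, 4: 0.0, 5: 2.0}

predicted = parse_ranking(raw_output)
print(top1_accuracy(predicted, gold_best=3), ndcg(predicted, gold_relevance))
```

Validating the ranking as a permutation before scoring means a malformed model response fails loudly at the parsing step instead of silently distorting the metrics downstream.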
Anthology ID:
2026.mwe-1.32
Volume:
Proceedings of the 22nd Workshop on Multiword Expressions (MWE 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Atul Kr. Ojha, Verginica Barbu Mititelu, Mathieu Constant, Ivelina Stoyanova, A. Seza Doğruöz, Alexandre Rademaker
Venues:
MWE | WS
Publisher:
Association for Computational Linguistics
Pages:
248–253
URL:
https://preview.aclanthology.org/ingest-eacl/2026.mwe-1.32/
Cite (ACL):
Özge Umut and Bora Şenceylan. 2026. ITUNLP2 at MWE-2026 AdMIRe 2: Modular Zero-Shot Pipelines for Multimodal Idiom Grounding and Ranking. In Proceedings of the 22nd Workshop on Multiword Expressions (MWE 2026), pages 248–253, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
ITUNLP2 at MWE-2026 AdMIRe 2: Modular Zero-Shot Pipelines for Multimodal Idiom Grounding and Ranking (Umut & Şenceylan, MWE 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.mwe-1.32.pdf