DCSN-NLP at MWE-2026 AdMIRe 2: Bridging Literal and Figurative Meaning Through Hierarchical Multimodal Reasoning

David Cotigă, Sergiu Nisioi


Abstract
This paper presents our system for the MWE-2026 ADMiRe 2.0 shared task, which aimedto advance multimodal idiomatic understand-ing across 15 languages. We address the taskof selecting, from a set of five images, theone that best represents either the literal oridiomatic meaning of a given compound incontext. Our approach follows a multi-steppipeline: a large language model (LLM) firstdetermines whether the compound is used lit-erally or idiomatically and generates auxiliarytext, consisting of an idiomatic meaning expla-nation and a visual description of the literalmeaning. An ensemble of three CLIP modelsthen identifies the two images most semanti-cally similar to the appropriate generated textvia a voting mechanism. Finally, the LLM se-lects the best image from these two candidates.
Anthology ID:
2026.mwe-1.29
Volume:
Proceedings of the 22nd Workshop on Multiword Expressions (MWE 2026)
Month:
March
Year:
2026
Address:
Rabat, Marocco
Editors:
Atul Kr. Ojha, Verginica Barbu Mititelu, Mathieu Constant, Ivelina Stoyanova, A. Seza Doğruöz, Alexandre Rademaker
Venues:
MWE | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
217–225
Language:
URL:
https://preview.aclanthology.org/ingest-eacl/2026.mwe-1.29/
DOI:
Bibkey:
Cite (ACL):
David Cotigă and Sergiu Nisioi. 2026. DCSN-NLP at MWE-2026 AdMIRe 2: Bridging Literal and Figurative Meaning Through Hierarchical Multimodal Reasoning. In Proceedings of the 22nd Workshop on Multiword Expressions (MWE 2026), pages 217–225, Rabat, Marocco. Association for Computational Linguistics.
Cite (Informal):
DCSN-NLP at MWE-2026 AdMIRe 2: Bridging Literal and Figurative Meaning Through Hierarchical Multimodal Reasoning (Cotigă & Nisioi, MWE 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.mwe-1.29.pdf