ImaRA: An Imaginative Frame Augmented Method for Low-Resource Multimodal Metaphor Detection and Explanation

Yuan Tian, Minzheng Wang, Nan Xu, Wenji Mao


Abstract
Multimodal metaphor detection, which aims to distinguish metaphorical from literal multimodal expressions, is an important and challenging task in multimedia computing. Existing studies mainly apply standard multimodal computing approaches to detection, neglecting the unique cross-domain and cross-modality characteristics underlying multimodal metaphor understanding. According to Conceptual Metaphor Theory (CMT), the inconsistency between source and target domains and their attribute similarity are essential for inferring the intricate meanings implied in metaphors. In practice, the scarcity of annotated multimodal metaphorical content in the real world adds difficulty to the detection task and further complicates the understanding of multimodal metaphors. To address these challenges, in this paper we propose a novel Imaginative FRame Augmented (ImaRA) method, inspired by CMT, for low-resource multimodal metaphor detection and explanation. Specifically, we first identify the imaginative frame as an associative structure that stimulates imaginative thinking in multimodal metaphor detection and understanding. We then construct a cross-modal imagination dataset rich in multimodal metaphors and corresponding imaginative frames, and retrieve an augmented instance from this dataset using imaginative frames mined from the input. The augmented instance serves as a demonstration exemplar to boost the metaphor reasoning ability of a multimodal large language model (MLLM) in low-resource scenarios. Experiments on two publicly available datasets show that our method consistently achieves robust results compared with MLLM-based methods for both multimodal metaphor detection and explanation in low-resource scenarios, and also surpasses existing multimodal metaphor detection methods trained with full data.
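The retrieval-augmented exemplar step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names (ImaginationEntry, retrieve_exemplar, build_prompt), the TF-IDF similarity, and the toy dataset entries are all hypothetical stand-ins for the paper's imaginative-frame mining and retrieval components.

# Minimal sketch of retrieving a demonstration exemplar by imaginative-frame
# similarity and composing a one-shot prompt for an MLLM. All identifiers and
# data below are hypothetical; the paper's actual miner, retriever, and prompt
# format are not specified here.
from dataclasses import dataclass
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

@dataclass
class ImaginationEntry:
    caption: str       # textual description of the image-text pair
    frame: str         # imaginative frame: source/target domains + shared attribute
    explanation: str   # metaphor explanation used as the demonstration

# Toy stand-in for the cross-modal imagination dataset (invented entries).
IMAGINATION_DATASET = [
    ImaginationEntry(
        caption="A politician drawn with a snake's body",
        frame="source: snake; target: politician; attribute: deceitfulness",
        explanation="The politician is depicted as a snake, implying deceit.",
    ),
    ImaginationEntry(
        caption="A light bulb drawn as a blooming flower",
        frame="source: flower; target: idea; attribute: growth",
        explanation="The idea is depicted as a blooming flower, implying growth.",
    ),
]

def retrieve_exemplar(input_frame: str) -> ImaginationEntry:
    """Return the dataset entry whose frame is most similar to the input frame."""
    corpus = [e.frame for e in IMAGINATION_DATASET] + [input_frame]
    tfidf = TfidfVectorizer().fit_transform(corpus)
    # Similarity between the input frame (last row) and every dataset frame.
    sims = cosine_similarity(tfidf[-1], tfidf[:-1])[0]
    return IMAGINATION_DATASET[sims.argmax()]

def build_prompt(image_caption: str, input_frame: str) -> str:
    """Compose a one-shot prompt using the retrieved exemplar as demonstration."""
    ex = retrieve_exemplar(input_frame)
    return (
        f"Example image: {ex.caption}\nFrame: {ex.frame}\n"
        f"Explanation: {ex.explanation}\n\n"
        f"Now analyze this image: {image_caption}\nFrame: {input_frame}\n"
        "Is it metaphorical? Explain."
    )

print(build_prompt(
    "An hourglass whose sand is melting ice caps",
    "source: hourglass; target: climate change; attribute: running out of time",
))

In the paper's setting, retrieval would operate over the constructed cross-modal imagination dataset and the resulting prompt would be passed to the MLLM together with the input image; the sketch above covers only the text side of that pipeline.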
Anthology ID: 2025.findings-naacl.220
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3953–3967
URL: https://preview.aclanthology.org/moar-dois/2025.findings-naacl.220/
DOI: 10.18653/v1/2025.findings-naacl.220
Cite (ACL): Yuan Tian, Minzheng Wang, Nan Xu, and Wenji Mao. 2025. ImaRA: An Imaginative Frame Augmented Method for Low-Resource Multimodal Metaphor Detection and Explanation. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 3953–3967, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): ImaRA: An Imaginative Frame Augmented Method for Low-Resource Multimodal Metaphor Detection and Explanation (Tian et al., Findings 2025)
PDF: https://preview.aclanthology.org/moar-dois/2025.findings-naacl.220.pdf