MultiCMET: A Novel Chinese Benchmark for Understanding Multimodal Metaphor

Dongyu Zhang, Jingwei Yu, Senyuan Jin, Liang Yang, Hongfei Lin


Abstract
Metaphor is a pervasive aspect of human communication, and its presence in multimodal forms has become more prominent with the rise of mass media. However, multimodal metaphor resources beyond English remain scarce. Furthermore, existing work in natural language processing has not explored the categorization of source and target domains in metaphors. This omission is significant given the extensive research in cognitive linguistics, which emphasizes that a deep understanding of metaphor relies on recognizing the differences and similarities between domain categories. We therefore introduce MultiCMET, a multimodal Chinese metaphor dataset consisting of 13,820 text-image pairs from advertisements, manually annotated for the occurrence of metaphor, domain categories, and the sentiments that metaphors convey. We also construct a domain lexicon that categorizes metaphorical source and target domains, and we propose a Cascading Domain Knowledge Integration (CDKI) benchmark that detects metaphors by introducing domain-specific lexical features. Experimental results demonstrate the effectiveness of CDKI. The dataset and code are publicly available.
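
This page contains no code; as a rough, hypothetical sketch of the CDKI idea described in the abstract (the fusion architecture, feature dimensions, lexicon format, and all names below are illustrative assumptions, not the authors' implementation), domain-lexicon features could be cascaded into a text-image metaphor classifier roughly as follows:

# Hypothetical sketch of cascading domain-knowledge integration for
# multimodal metaphor detection. Lexicon format, dimensions, and fusion
# order are assumptions for illustration, not the paper's actual model.
import torch
import torch.nn as nn

class CDKISketch(nn.Module):
    """Fuses text and image embeddings with domain-lexicon features in stages."""

    def __init__(self, text_dim=768, image_dim=2048, lex_dim=32, hidden=256):
        super().__init__()
        # Stage 1: inject lexicon-derived domain features into the text view.
        self.text_proj = nn.Linear(text_dim + lex_dim, hidden)
        # Stage 2: cascade the enriched text view with the image view.
        self.fuse = nn.Linear(hidden + image_dim, hidden)
        self.classifier = nn.Linear(hidden, 2)  # metaphorical vs. literal

    def forward(self, text_emb, image_emb, lex_feats):
        stage1 = torch.relu(self.text_proj(torch.cat([text_emb, lex_feats], dim=-1)))
        stage2 = torch.relu(self.fuse(torch.cat([stage1, image_emb], dim=-1)))
        return self.classifier(stage2)

def lexicon_features(tokens, domain_lexicon, lex_dim=32):
    """Bag-of-domains vector: counts tokens per domain category in the lexicon."""
    feats = torch.zeros(lex_dim)
    for tok in tokens:
        domain_id = domain_lexicon.get(tok)  # e.g. mapping a word to a category id
        if domain_id is not None:
            feats[domain_id] += 1.0
    return feats

# Toy usage with random tensors standing in for text/image encoder outputs.
model = CDKISketch()
text_emb, image_emb = torch.randn(1, 768), torch.randn(1, 2048)
lex = lexicon_features(["lion", "courage"], {"lion": 3}).unsqueeze(0)
logits = model(text_emb, image_emb, lex)

The point being illustrated is the cascade: lexicon-derived domain features are injected into the text representation first, and the image representation is fused in a later stage, rather than concatenating all three views at once.
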
Anthology ID:
2023.findings-emnlp.409
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6141–6154
URL:
https://aclanthology.org/2023.findings-emnlp.409
DOI:
10.18653/v1/2023.findings-emnlp.409
Cite (ACL):
Dongyu Zhang, Jingwei Yu, Senyuan Jin, Liang Yang, and Hongfei Lin. 2023. MultiCMET: A Novel Chinese Benchmark for Understanding Multimodal Metaphor. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 6141–6154, Singapore. Association for Computational Linguistics.
Cite (Informal):
MultiCMET: A Novel Chinese Benchmark for Understanding Multimodal Metaphor (Zhang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.409.pdf