FigMemes: A Dataset for Figurative Language Identification in Politically-Opinionated Memes

Chen Liu, Gregor Geigle, Robin Krebs, Iryna Gurevych


Abstract
Real-world politically-opinionated memes often rely on figurative language to cloak propaganda and radical ideas to help them spread. It is not only a scientific challenge to develop machine learning models to recognize them in memes, but also sociologically beneficial to understand hidden meanings at scale and raise awareness. These memes are fast-evolving (in both topics and visuals) and it remains unclear whether current multimodal machine learning models are robust to such distribution shifts. To enable future research into this area, we first present FigMemes, a dataset for figurative language classification in politically-opinionated memes. We evaluate the performance of state-of-the-art unimodal and multimodal models and provide comprehensive benchmark results. The key contributions of this proposed dataset include annotations of six commonly used types of figurative language in politically-opinionated memes, and a wide range of topics and visual styles. We also provide analyses on the ability of multimodal models to generalize across distribution shifts in memes. Our dataset poses unique machine learning challenges and our results show that current models have significant room for improvement in both performance and robustness to distribution shifts.
Anthology ID:
2022.emnlp-main.476
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7069–7086
URL:
https://aclanthology.org/2022.emnlp-main.476
DOI:
10.18653/v1/2022.emnlp-main.476
Cite (ACL):
Chen Liu, Gregor Geigle, Robin Krebs, and Iryna Gurevych. 2022. FigMemes: A Dataset for Figurative Language Identification in Politically-Opinionated Memes. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7069–7086, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
FigMemes: A Dataset for Figurative Language Identification in Politically-Opinionated Memes (Liu et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2022.emnlp-main.476.pdf