MemeInterpret: Towards an All-in-One Dataset for Meme Understanding

Jeongsik Park, Khoi P. N. Nguyen, Jihyung Park, Minseok Kim, Jaeheon Lee, Jae Won Choi, Kalyani Ganta, Phalgun Ashrit Kasu, Rohan Sarakinti, Sanjana Vipperla, Sai Sathanapalli, Nishan Vaghani, Vincent Ng


Abstract
Meme captioning, the task of generating a sentence that describes the meaning of a meme, is both challenging and important in advancing Computational Meme Understanding (CMU). However, existing research has not explored its decomposition into subtasks or its connections to other CMU tasks. To address this gap, we introduce MemeInterpret, a meme corpus containing meme captions together with corresponding surface messages and relevant background knowledge. Strategically built upon the Facebook Hateful Memes dataset, MemeInterpret is the last piece in a set of corpora that unifies three major categories of CMU tasks for the first time. Extensive experiments on MemeInterpret and connected datasets suggest strong relationships between meme captioning, its two proposed subtasks, and the other two key categories of CMU tasks: classification and explanation. To stimulate further research on CMU, we make our dataset publicly available at https://github.com/npnkhoi/MemeInterpret.
Anthology ID:
2025.findings-emnlp.871
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
16073–16087
URL:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.871/
DOI:
10.18653/v1/2025.findings-emnlp.871
Cite (ACL):
Jeongsik Park, Khoi P. N. Nguyen, Jihyung Park, Minseok Kim, Jaeheon Lee, Jae Won Choi, Kalyani Ganta, Phalgun Ashrit Kasu, Rohan Sarakinti, Sanjana Vipperla, Sai Sathanapalli, Nishan Vaghani, and Vincent Ng. 2025. MemeInterpret: Towards an All-in-One Dataset for Meme Understanding. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 16073–16087, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
MemeInterpret: Towards an All-in-One Dataset for Meme Understanding (Park et al., Findings 2025)
PDF:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.871.pdf
Checklist:
2025.findings-emnlp.871.checklist.pdf