Aaron Xu


2025

MemeQA: Holistic Evaluation for Meme Understanding
Khoi P. N. Nguyen | Terrence Li | Derek Lou Zhou | Gabriel Xiong | Pranav Balu | Nandhan Alahari | Alan Huang | Tanush Chauhan | Harshavardhan Bala | Emre Guzelordu | Affan Kashfi | Aaron Xu | Suyesh Shrestha | Megan Vu | Jerry Wang | Vincent Ng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automated meme understanding requires systems to demonstrate fine-grained visual recognition, commonsense reasoning, and extensive cultural knowledge. However, existing benchmarks for meme understanding cover only narrow aspects of meme semantics. To fill this gap, we present MemeQA, a dataset of over 9,000 multiple-choice questions designed to holistically evaluate meme comprehension across seven cognitive aspects. Experiments show that state-of-the-art Large Multimodal Models perform much worse than humans on MemeQA. While fine-tuning improves their performance, they still make many errors on memes whose proper understanding requires going beyond surface-level sentiment. Moreover, injecting “None of the above” into the available options makes the questions more challenging for the models. Our dataset is publicly available at https://github.com/npnkhoi/memeqa.