Zhijun Xu
2025
PunMemeCN: A Benchmark to Explore Vision-Language Models’ Understanding of Chinese Pun Memes
Zhijun Xu | Siyu Yuan | Yiqiao Zhang | Jingyu Sun | Tong Zheng | Deqing Yang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Pun memes, which combine wordplay with visual elements, are a popular form of humor in Chinese online communication. Despite their prevalence, current Vision-Language Models (VLMs) have not been systematically evaluated on understanding and applying these culturally specific multimodal expressions. In this paper, we introduce PunMemeCN, a novel benchmark designed to assess VLMs’ capabilities in processing Chinese pun memes across three progressive tasks: pun meme detection, sentiment analysis, and chat-driven meme response. PunMemeCN consists of 1,959 Chinese memes (653 pun memes and 1,306 non-pun memes) with comprehensive annotations of punchlines, sentiments, and explanations, alongside 2,008 multi-turn chat conversations incorporating these memes. Our experiments show that state-of-the-art VLMs struggle with Chinese pun memes, particularly homophone wordplay, even with Chain-of-Thought prompting. Notably, punchlines in memes can effectively conceal potentially harmful content from AI detection. These findings underscore the challenges of cross-cultural multimodal understanding and highlight the need for culture-specific approaches to humor comprehension in AI systems.
2024
“A good pun is its own reword”: Can Large Language Models Understand Puns?
Zhijun Xu | Siyu Yuan | Lingjie Chen | Deqing Yang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Puns are valuable in academic research because their distinct structure and clear definition aid the comprehensive analysis of linguistic humor. However, how well large language models (LLMs) understand puns has not been thoroughly examined, limiting their use in creative writing and humor creation. In this paper, we leverage three popular tasks, i.e., pun recognition, explanation, and generation, to systematically evaluate the capabilities of LLMs in pun understanding. In addition to adopting the automated evaluation metrics from prior research, we introduce new evaluation methods and metrics that are better suited to the in-context learning paradigm of LLMs. These new metrics offer a more rigorous assessment of an LLM’s ability to understand puns and align more closely with human cognition than previous metrics. Our findings reveal a “lazy pun generation” pattern and identify the primary challenges LLMs encounter in understanding puns.