CMBan: Cartoon-Driven Meme Contextual Classification Dataset for Bangla

Newaz Ben Alam, Akm Moshiur Rahman Mazumder, Mir Sazzat Hossain, Mysha Samiha, Md Alvi Noor Hossain, Md Fahim, Amin Ahsan Ali, Ashraful Islam, M Ashraful Amin, Akmmahbubur Rahman


Abstract
Memes, particularly cartoon images, are a prevalent form of communication on social networks and often convey complex sentiments or harmful content. Detecting such content, particularly when it involves Bengali and English text, remains a multimodal challenge. This paper introduces CMBan, a novel and culturally relevant dataset of 2,641 annotated cartoon memes. It supports meme classification based on sentiment across five key categories: Humor, Sarcasm, Offensiveness, Motivational Content, and Overall Sentiment, incorporating both image and text features. Our curated dataset specifically aids in detecting nuanced offensive content and in navigating the complexities of pure Bengali, English, or code-mixed Bengali-English text. Through rigorous experimentation with over 12 multimodal models, including monolingual, multilingual, and proprietary architectures, and with prompting methods such as Chain-of-Thought (CoT), we find that this cartoon-based, code-mixed meme content poses substantial understanding challenges. Experimental results show that closed models outperform open models, while the LoRA fine-tuning strategy equalizes performance across model architectures and improves classification of challenging aspects in multilingual meme contexts. This work advances meme classification by providing an effective solution for detecting harmful content in multilingual meme contexts.
Anthology ID:
2025.findings-ijcnlp.135
Volume:
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venue:
Findings
Publisher:
The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Pages:
2178–2194
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.135/
Cite (ACL):
Newaz Ben Alam, Akm Moshiur Rahman Mazumder, Mir Sazzat Hossain, Mysha Samiha, Md Alvi Noor Hossain, Md Fahim, Amin Ahsan Ali, Ashraful Islam, M Ashraful Amin, and Akmmahbubur Rahman. 2025. CMBan: Cartoon-Driven Meme Contextual Classification Dataset for Bangla. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 2178–2194, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal):
CMBan: Cartoon-Driven Meme Contextual Classification Dataset for Bangla (Alam et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.135.pdf