Arabic Humor as a Diagnostic Probe for Large Language Models

Wajdi Zaghouani

Abstract

Arabic humor provides a challenging diagnostic test for large language models because interpreting jokes often requires pragmatic inference, sociolinguistic awareness, and culturally grounded knowledge that standard NLP benchmarks do not evaluate. Arabic is particularly suitable for probing these abilities given its diglossic structure and dialect diversity, where humor frequently arises from register contrast, dialect-specific vocabulary, and shared cultural references. We propose a three-layer taxonomy of Arabic humor mechanisms covering pragmatic, semantic, and sociolinguistic phenomena, illustrated through thirteen curated examples spanning Egyptian, Levantine, Gulf, Tunisian, and Iraqi Arabic. Building on this taxonomy, we introduce a diagnostic evaluation framework using contrastive minimal pairs, a multi-dimensional scoring rubric, and a cultural presupposition ontology. A small proof-of-concept probing study with GPT-4o, Gemini 2.0 Flash, and Claude Sonnet 4.5 reveals recurring failure patterns in sarcasm interpretation, register contrast reasoning, dialectal vocabulary coverage, and cultural grounding. We position this work as a diagnostic framework and pilot, not a mature benchmark, and outline a path toward larger annotated resources.

Anthology ID: 2026.chum-1.3
Volume: Proceedings of the 2nd Workshop on Computational Humor (CHum 2026)
Month: July
Year: 2026
Address: Online
Editors: Ori Amir, Christian F. Hempelmann, Julia Rayz, Tiansi Dong, Tristan Miller
Venues: chum | WS
Publisher: Association for Computational Linguistics
Pages: 39–50
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.chum-1.3/
Cite (ACL): Wajdi Zaghouani. 2026. Arabic Humor as a Diagnostic Probe for Large Language Models. In Proceedings of the 2nd Workshop on Computational Humor (CHum 2026), pages 39–50, Online. Association for Computational Linguistics.
Cite (Informal): Arabic Humor as a Diagnostic Probe for Large Language Models (Zaghouani, chum 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.chum-1.3.pdf
