SemEval-2026 Task 1: MWAHAHA, Models Write Automatic Humor And Humans Annotate

Santiago Castro; Luis Chiruzzo; Santiago Góngora; Naihao Deng; Salar Rahili; Ignacio Sastre; Aiala Rosá; Victoria Amoroso; Guillermo Rey; Guillermo Moncecchi; J. A. Meaney; Juan José Prada; Rada Mihalcea

Abstract

We present SemEval-2026 Task 1: MWAHAHA (Models Write Automatic Humor And Humans Annotate), the first shared task on general-purpose humor generation. Systems must produce short jokes in English, Spanish, and Chinese under lexical or topical constraints (Subtask A) and generate humorous captions for GIFs (Subtask B). To discourage memorization and ensure fairness, all jokes must meet specific criteria, such as using infrequent word pairs or relating to recent news headlines. Evaluation is conducted through pairwise human preference judgments in a Chatbot Arena-style setting, yielding Elo-based rankings. The task attracted 309 registered users, with 37 teams submitting systems to the evaluation phase. Participating systems employ a wide range of NLP techniques, including generate-then-rank pipelines, reinforcement learning, parameter-efficient fine-tuning, retrieval-augmented generation, humor-theory-grounded prompting, and persona-based strategies. Our Gemini 2.5 Flash baseline, using simple prompts, tied for first place in all subtasks, and the majority of elaborate multi-stage pipelines only marginally surpassed it with overlapping confidence intervals. More work is necessary to outperform the simple usage of state-of-the-art large language models. We release all evaluation data, prompts, and leaderboard results to support future research in computational humor generation.
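As a rough illustration of how pairwise preference judgments can be turned into an Elo-style ranking, here is a minimal sketch of the standard online Elo update. The abstract does not specify the task's actual rating procedure, so the K-factor, the initial rating, the tie handling, and the names `expected_score` and `elo_ratings` are all assumptions made for this example, not the organizers' implementation.

```python
from collections import defaultdict


def expected_score(r_a, r_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_ratings(battles, k=32.0, initial=1000.0):
    """Compute Elo ratings from a sequence of pairwise judgments.

    battles: iterable of (system_a, system_b, outcome) tuples, where outcome
    is 1.0 if A was preferred, 0.0 if B was preferred, and 0.5 for a tie.
    k and initial are illustrative defaults (assumptions), not task settings.
    """
    ratings = defaultdict(lambda: initial)
    for a, b, outcome in battles:
        r_a, r_b = ratings[a], ratings[b]
        e_a = expected_score(r_a, r_b)
        # Each system moves by k times (observed score - expected score).
        ratings[a] = r_a + k * (outcome - e_a)
        ratings[b] = r_b + k * ((1.0 - outcome) - (1.0 - e_a))
    return dict(ratings)


# Hypothetical judgments between two made-up systems.
example = elo_ratings([
    ("sys_a", "sys_b", 1.0),
    ("sys_a", "sys_b", 0.5),
])
ranking = sorted(example, key=example.get, reverse=True)
```

Online Elo depends on the order in which judgments arrive, so arena-style leaderboards often repeat the computation over shuffled or resampled judgments to obtain confidence intervals. Whether and how that was done for this task is not stated in the abstract.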

Anthology ID: 2026.semeval-1.454
Volume: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 3797–3822
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.454/
Cite (ACL): Santiago Castro, Luis Chiruzzo, Santiago Góngora, Naihao Deng, Salar Rahili, Ignacio Sastre, Aiala Rosá, Victoria Amoroso, Guillermo Rey, Guillermo Moncecchi, J. A. Meaney, Juan José Prada, and Rada Mihalcea. 2026. SemEval-2026 Task 1: MWAHAHA, Models Write Automatic Humor And Humans Annotate. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3797–3822, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): SemEval-2026 Task 1: MWAHAHA, Models Write Automatic Humor And Humans Annotate (Castro et al., SemEval 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.454.pdf
Supplementary material: 2026.semeval-1.454.SupplementaryMaterial.zip