ABARUAH at SemEval-2026 Task 1: Leveraging High-Resolution VLMs and Reasoning LLMs for Multimodal Humor Generation

Arup Baruah


Abstract
This paper describes the systems developed for "SemEval 2026 Task 1: Humor Generation". This shared task covered both unimodal text constraints and multimodal GIF-based humor generation. The proposed approach used a two-stage pipeline consisting of a Multimodal Grounding stage to extract semantic descriptions from GIFs and a Humor Synthesis stage to generate the final humorous output. The Qwen2-VL and Qwen3-8B models were used for these respective stages. The system achieved competitive Elo-like ratings of 1009, 973, and 914 for Subtasks A, B1, and B2, respectively, demonstrating its ability to address diverse humorous constraints. The system was ranked 4th in overall standings for Subtasks A and B1.
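The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the class `HumorPipeline`, the helper `build_prompt`, the prompt wording, and the stub functions `fake_vlm` and `fake_llm` are all hypothetical names introduced here, and the stubs stand in for the Qwen2-VL and Qwen3-8B models so that the example runs without any model weights.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces; the abstract does not specify the real ones.
Describer = Callable[[str], str]  # GIF path -> textual description
Generator = Callable[[str], str]  # prompt -> humorous text


def build_prompt(description: str, constraint: str) -> str:
    """Assemble an illustrative prompt from the grounded description."""
    parts = [f"Scene: {description}"]
    if constraint:
        parts.append(f"Constraint: {constraint}")
    parts.append("Write one short, funny caption.")
    return "\n".join(parts)


@dataclass
class HumorPipeline:
    describe_gif: Describer   # stage 1: multimodal grounding (a VLM in the paper)
    generate_text: Generator  # stage 2: humor synthesis (an LLM in the paper)

    def run(self, gif_path: str, constraint: str = "") -> str:
        # Stage 1 turns the GIF into text; stage 2 only ever sees that text.
        description = self.describe_gif(gif_path)
        return self.generate_text(build_prompt(description, constraint))


# Stand-in stubs (assumptions) so the sketch is runnable as written.
def fake_vlm(gif_path: str) -> str:
    return f"a cat falling off a sofa ({gif_path})"


def fake_llm(prompt: str) -> str:
    # Echo the first prompt line to show what the second stage received.
    return "CAPTION<" + prompt.splitlines()[0] + ">"


pipeline = HumorPipeline(describe_gif=fake_vlm, generate_text=fake_llm)
result = pipeline.run("cat.gif", constraint="must mention gravity")
print(result)
```

Passing the two stages in as plain callables keeps the grounding and synthesis steps independently replaceable, which mirrors the separation the abstract describes.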
Anthology ID:
2026.semeval-1.436
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
3536–3543
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.436/
Cite (ACL):
Arup Baruah. 2026. ABARUAH at SemEval-2026 Task 1: Leveraging High-Resolution VLMs and Reasoning LLMs for Multimodal Humor Generation. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3536–3543, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
ABARUAH at SemEval-2026 Task 1: Leveraging High-Resolution VLMs and Reasoning LLMs for Multimodal Humor Generation (Baruah, SemEval 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.436.pdf
Supplementary material:
2026.semeval-1.436.SupplementaryMaterial.zip