RAGthoven at SemEval-2026 Task 1: A Multi-Stage Pipeline Walks Into a Benchmark and Barely Clears the Bar
Marek Suppa, Viktória Ondrejová, Lucia Ganajová, Gregor Karetka, Daniel Skala
Abstract
We present \textsc{RAGthoven}, our system for SemEval-2026 Task~1 (MuWaHaHa), Subtask~A (multilingual constrained humor generation in English, Spanish, and Chinese). \textsc{RAGthoven} decomposes creative text generation into a multi-stage large language model (LLM) pipeline (\textit{Planner}, \textit{Writer}, \textit{Reflector}, \textit{Judge}) grounded in computational humor theories (Benign Violation Theory, Script-based Semantic Theory of Humor) and iteratively refined through prompt engineering across ten experiments. In our final configuration, we augment the Planner with retrieval-augmented generation (RAG) from a curated joke corpus, seeding generation with diverse joke mechanisms. We additionally explore an agentic variant that exposes the same four pipeline stages as tool-calling agents orchestrated by a model loop with a \textsc{ConstraintAudit} checker. While it achieves full constraint compliance, human pairwise evaluation did not reveal a significant quality advantage over the simpler non-agentic baseline. \textsc{RAGthoven} achieves Rank~1 in all three languages, with the strongest result in Spanish (Elo 1182, 42 points above the Gemini~2.5~Flash baseline). However, while the system leads in raw Elo in Spanish, it shares Rank~1 with the baseline in all three languages due to overlapping confidence intervals; in English and Chinese the gap narrows further, suggesting that elaborate multi-stage prompt engineering may offer diminishing returns once a strong frontier model is in the loop.
- Anthology ID:
- 2026.semeval-1.416
- Volume:
- Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3343–3356
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.416/
- Cite (ACL):
- Marek Suppa, Viktória Ondrejová, Lucia Ganajová, Gregor Karetka, and Daniel Skala. 2026. RAGthoven at SemEval-2026 Task 1: A Multi-Stage Pipeline Walks Into a Benchmark and Barely Clears the Bar. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3343–3356, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- RAGthoven at SemEval-2026 Task 1: A Multi-Stage Pipeline Walks Into a Benchmark and Barely Clears the Bar (Suppa et al., SemEval 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.416.pdf