Jatin Agrawal

2026

j10official at SemEval-2026 Task 1: Neurosymbolic Humor Generation via GTVH-Guided LLM Decomposition
Jatin Agrawal | Radhika Mamidi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We present a neurosymbolic pipeline for computational humor generation grounded in the General Theory of Verbal Humor. The system constructs the joke in five sequential stages: context analysis, humor architecture (identifying core incongruity), delivery strategy, content writing, and pairwise judging, orchestrated through the DSPy framework. The system generates four candidate jokes per input with independent humor strategies, then selects the best through knockout tournament-style evaluation. Despite using Gemma 3 27B, a model with roughly 20× fewer total parameters than frontier systems, our approach achieves competitive results across all five subtasks of SemEval-2026 Task 1 (MWAHAHA), placing 2nd in two subtasks. We argue that these results demonstrate the viability of structured, theory-driven decomposition for solving complex tasks and that how a model reasons about humor is just as important as how large the model is.
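The knockout-style selection over candidate jokes mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `knockout_select` and `longer_joke_judge` are hypothetical, and the judge here is a trivial length-based stand-in for what the abstract describes as an LLM-based pairwise judging stage.

```python
from typing import Callable, List


def knockout_select(candidates: List[str], judge: Callable[[str, str], str]) -> str:
    """Return the winner of a single-elimination tournament.

    `judge(a, b)` must return whichever of its two arguments it prefers.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    current = list(candidates)
    while len(current) > 1:
        next_round = []
        # Pair neighbours; with an odd count, the last candidate gets a bye.
        for i in range(0, len(current) - 1, 2):
            next_round.append(judge(current[i], current[i + 1]))
        if len(current) % 2 == 1:
            next_round.append(current[-1])
        current = next_round
    return current[0]


def longer_joke_judge(a: str, b: str) -> str:
    # Hypothetical placeholder judge: prefers the longer text.
    # A real system would call a model-based pairwise comparison here.
    return a if len(a) >= len(b) else b


if __name__ == "__main__":
    jokes = ["a", "bbb", "cc", "dddd"]  # placeholder candidates
    print(knockout_select(jokes, longer_joke_judge))
```

With four candidates, this structure needs three pairwise comparisons (two in the first round, one final), compared with six for a full round-robin.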

Does Bigger Mean Funnier? Evaluating Humor Generation Across the Qwen3 Model Family
Jatin Agrawal | Radhika Mamidi
Proceedings of the 2nd Workshop on Computational Humor (CHum 2026)

We investigate whether scaling model parameters improves humor generation through a controlled ablation study. Using five Qwen3 variants (8B–235B, dense and MoE), we generate jokes across 50 themes. Beyond evaluating humor scaling, this work serves as an empirical study into the nature of LLM versus human evaluations on highly subjective creative tasks. While an automated judge yields a perfect monotonic ranking between parameter count and win rate, human annotators find no significant aggregate difference in humor quality. Restricting to themes where annotators agree reveals a significant preference for the largest model (p = 0.039), suggesting scaling effects exist but are masked by a "quality floor." Crucially, our analysis of bias characteristics shows that the automated judge exhibits severe positional and length biases compared to human evaluators, further suggesting that LLMs may systematically distort quality differences on subjective tasks.
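One common way to probe positional bias in a pairwise judge is to present each pair in both orders and count how often the verdict changes. The sketch below illustrates that idea only; the abstract does not say how the authors measured bias, and `position_flip_rate` and `always_first_judge` are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple


def position_flip_rate(
    pairs: List[Tuple[str, str]], judge: Callable[[str, str], str]
) -> float:
    """Fraction of pairs whose winner changes when the order is swapped.

    `judge(first, second)` must return whichever of its arguments it prefers.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    flips = 0
    for a, b in pairs:
        # An order-insensitive judge picks the same text both times.
        if judge(a, b) != judge(b, a):
            flips += 1
    return flips / len(pairs)


def always_first_judge(first: str, second: str) -> str:
    # Hypothetical, maximally position-biased judge: always picks slot one.
    return first


if __name__ == "__main__":
    pairs = [("joke A", "joke B"), ("joke C", "joke D")]  # placeholder data
    print(position_flip_rate(pairs, always_first_judge))
```

A flip rate near zero indicates an order-insensitive judge, while a rate near one indicates verdicts driven mostly by position.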

2025

Early detection of disease outbreaks is crucial to ensure timely intervention by health authorities. Due to the challenges associated with traditional indicator-based surveillance, monitoring informal sources such as online media has become increasingly popular. However, owing to the volume of online articles published every day, manual screening is impractical. To address this, we propose Health Sentinel, a multi-stage information extraction pipeline that uses a combination of ML and non-ML methods to extract events (structured information concerning disease outbreaks or other unusual health events) from online articles. The extracted events are made available to the Media Scanning and Verification Cell (MSVC) at the National Centre for Disease Control (NCDC), Delhi, for analysis, interpretation, and further dissemination to local agencies for timely intervention. Since April 2022, Health Sentinel has processed over 300 million news articles and identified over 95,000 unique health events across India, of which over 3,500 were shortlisted by public health experts at NCDC as potential outbreaks.