Rishabh Sabharwal
2026
Does Reasoning Kill the Joke? Long-Context Humor Understanding in Hindi
Kaveri Anuranjana | Navya Shrivastava | Atharv Johar | Rishabh Sabharwal | Gautam Ranka | Aryan Lunawat | Punit Rathore | Radhika Mamidi
Proceedings of the 4th Workshop on Cross-Cultural Considerations in NLP (C3NLP 2026)
Verbal humor involves reasoning through complex conversational contexts. Although LLMs have achieved strong performance on English humor datasets, their ability to interpret humor in Hindi remains unexplored. In this paper, we evaluate Hindi humor understanding using dialogues extracted from humorous video clips. Our pipeline transforms video content into detailed textual streams, including dialogue transcripts and scene descriptions, enabling reasoning over inputs exceeding 2,000 words. We test a range of LLMs, from efficient edge models (Qwen-2.5-7B, Qwen-3-7B, Gemma-3-27B) to Indic-focused models (Sarvam-M-24B) and large frontier models (Llama-3.1-70B, Gemini-2.0-Flash). Our findings show a concave performance pattern in long-context understanding: reasoning quality peaks at moderate lengths (250–750 words) and declines at longer contexts. We also show that standard metrics overstate pragmatic competence. While increasing model size generally improves performance, smaller LLMs exhibit distinct failures due to instruction-following and linguistic issues, necessitating diversity metrics to capture hallucinations. Smaller, Hindi-focused models can compete with much larger generalist models. Importantly, our evaluation reveals that conversational humor remains a challenge even for specialized models, making HinS a valuable benchmark for advancing research in Hindi long-context humor reasoning.