Priyanshu Mahato

2026

LLMs in Sarcasm Detection? It’s elementary! (Or is it?)
Priyanshu Mahato | Aniket Santosh Mishra | Kripabandhu Ghosh
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While Large Language Models (LLMs) are frequently cited for their sophisticated pragmatic reasoning (CITATION), recent progress in sarcasm detection increasingly relies on synthetic benchmarks (CITATION). This study exposes a catastrophic generalization gap in this paradigm: we observe that models achieve near-perfect accuracy on synthetic data but collapse to random guessing on organic human speech. By triangulating hidden state geometry, entropy analysis, and causal interventions, we demonstrate that this disparity stems from shortcut learning (CITATION)—models exploit the low-entropy statistical signatures of generated text while remaining “semantically blind” to the pragmatic cues essential for irony. Our findings indicate that high performance on synthetic leaderboards reflects forensic pattern matching rather than the genuine linguistic intelligence assumed in prior work, creating a statistical mirage of competence.

Venues

ACL (1)
