Sungmok Jung
2026
Thunder-NUBench: A Benchmark for LLMs’ Sentence-Level Negation Understanding
Yeonkyoung So | Gyuseong Lee | Sungmok Jung | Joonhak Lee | JiA Kang | Sangho Kim | Jaejin Lee
Findings of the Association for Computational Linguistics: EACL 2026
Negation is a fundamental linguistic phenomenon that poses ongoing challenges for Large Language Models (LLMs), particularly in tasks requiring deep semantic understanding. Current benchmarks often treat negation as a minor detail within broader tasks, such as natural language inference. Consequently, there is a lack of benchmarks specifically designed to evaluate comprehension of negation. In this work, we introduce *Thunder-NUBench* — a novel benchmark explicitly created to assess sentence-level understanding of negation in LLMs. Thunder-NUBench goes beyond identifying surface-level cues by contrasting standard negation with structurally diverse alternatives, such as local negation, contradiction, and paraphrase. This benchmark includes manually created sentence-negation pairs and a multiple-choice dataset, allowing for a comprehensive evaluation of models’ understanding of negation.