None of the Above, Less of the Right: Parallel Patterns in Human and LLM Performance on Multi-Choice Questions Answering

Zhi Rui Tam; Cheng-Kuang Wu; Chieh-Yen Lin; Yun-Nung Chen

Abstract

Multiple-choice exam questions with “None of the above” (NA) options have been extensively studied in educational testing, where existing research suggests that they better assess true knowledge. However, their impact on the evaluation of Large Language Models (LLMs) remains underexplored. Through systematic experiments with 28 LLMs on the MMLU benchmark, we examine how NA options affect model performance and confidence calibration. Our analysis reveals that NA options, when used as the correct answer, lead to a consistent 30-50% performance drop across models regardless of scale, suggesting that LLMs lack the meta-cognitive ability to systematically evaluate and reject all given options when none is correct. This degradation shows strong domain dependence, with minimal impact on mathematical reasoning (14.6% drop) but severe effects on tasks requiring uncertainty handling, such as business ethics (48.1% drop). Our results highlight important implications for benchmark design and raise questions about LLMs’ ability to handle uncertainty in real-world applications.
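
As an illustration of the setup the abstract describes, the following is a minimal sketch of how an MMLU-style item might be rewritten so that NA becomes the correct answer, and how the resulting accuracy drop could be measured. It is not the authors' code: the `MCQ` type, the helper names, the choice to place NA in the slot of the original correct option, and the use of an absolute accuracy difference are all assumptions made for this example.

```python
from dataclasses import dataclass

NA_TEXT = "None of the above"


@dataclass(frozen=True)
class MCQ:
    """Hypothetical container for one multiple-choice item."""
    question: str
    options: tuple[str, ...]
    answer_index: int  # index of the correct option


def with_na_as_correct(item: MCQ) -> MCQ:
    """Replace the correct option with NA, so NA becomes the right answer.

    Assumption: NA takes the position of the original correct option.
    The paper may place it elsewhere (for example, always last).
    """
    options = list(item.options)
    options[item.answer_index] = NA_TEXT
    return MCQ(item.question, tuple(options), item.answer_index)


def accuracy(predictions: list[int], items: list[MCQ]) -> float:
    """Fraction of items whose predicted index matches the answer index."""
    correct = sum(p == it.answer_index for p, it in zip(predictions, items))
    return correct / len(items)


def absolute_drop(base_acc: float, na_acc: float) -> float:
    """Accuracy lost when moving from the original items to the NA variant."""
    return base_acc - na_acc
```

In use, a model would be scored once on the original items and once on the output of `with_na_as_correct`, and the two accuracies compared with `absolute_drop`. Whether the reported 30-50% figures are absolute or relative drops is not stated in the abstract, so the metric here is only one possible reading.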

Anthology ID: 2025.findings-acl.1031
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues: Findings | WS
Publisher: Association for Computational Linguistics
Pages: 20112–20134
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1031/
Cite (ACL): Zhi Rui Tam, Cheng-Kuang Wu, Chieh-Yen Lin, and Yun-Nung Chen. 2025. None of the Above, Less of the Right: Parallel Patterns in Human and LLM Performance on Multi-Choice Questions Answering. In Findings of the Association for Computational Linguistics: ACL 2025, pages 20112–20134, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): None of the Above, Less of the Right: Parallel Patterns in Human and LLM Performance on Multi-Choice Questions Answering (Tam et al., Findings 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1031.pdf
