When Models Hesitate: Answer Instability as a Label-Free Uncertainty Signal for LLMs

Jasper Meynard Arana, Kristine Ann M. Carandang, Ethan Robert Casin, Christian Alis, Christopher Monterola


Abstract
Large language models (LLMs) are increasingly deployed in high-stakes settings, yet reliably estimating when their outputs should be trusted remains an open challenge. Existing uncertainty estimation approaches—such as calibration, token-level probabilities, or semantic entropy—typically require access to model internals, additional supervision, or computationally intensive pipelines. We propose answer instability, defined as the variability of a model’s final answer across repeated stochastic generations of the same prompt, as a simple, label-free, and black-box uncertainty signal. Evaluated across three task types — reasoning, multiple-choice QA, and constraint-following — using four LLMs and 520 prompt-model pairs, our approach achieves performance competitive with semantic entropy while requiring no semantic similarity model. Our results show that instability strongly correlates with prediction errors and reliably discriminates correct from incorrect outputs. We further demonstrate its utility for selective prediction and targeted repair, improving reliability without access to internal probabilities or additional training.
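As a rough illustration of the signal described in the abstract, the sketch below scores instability from a list of final answers sampled for the same prompt. The abstract does not give the exact metric, so the formula used here (one minus the share of the most frequent answer), the string normalization, the function names `answer_instability` and `select_or_abstain`, and the threshold of 0.3 are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def _normalize(answers):
    # Assumed normalization: trim whitespace and lowercase so that
    # "B" and " b " count as the same final answer.
    return [a.strip().lower() for a in answers]


def answer_instability(answers):
    """Return 1 minus the share of the most frequent answer.

    0.0 means every sample agreed; values near 1.0 mean the samples
    disagreed heavily. This formula is an illustrative assumption.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    normalized = _normalize(answers)
    top_count = Counter(normalized).most_common(1)[0][1]
    return 1.0 - top_count / len(normalized)


def select_or_abstain(answers, threshold=0.3):
    """Return the majority answer, or None if instability exceeds threshold.

    The threshold value is hypothetical and would need tuning per task.
    """
    if answer_instability(answers) > threshold:
        return None
    return Counter(_normalize(answers)).most_common(1)[0][0]
```

In this sketch, selective prediction amounts to abstaining whenever the sampled answers disagree too much. Only the final answers are needed, which is consistent with the black-box, label-free setting the abstract describes.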
Anthology ID:
2026.acl-srw.72
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
816–826
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.72/
Cite (ACL):
Jasper Meynard Arana, Kristine Ann M. Carandang, Ethan Robert Casin, Christian Alis, and Christopher Monterola. 2026. When Models Hesitate: Answer Instability as a Label-Free Uncertainty Signal for LLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 816–826, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
When Models Hesitate: Answer Instability as a Label-Free Uncertainty Signal for LLMs (Arana et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.72.pdf