Thinking Out Loud: Do Reasoning Models Know When They’re Right?

Qingcheng Zeng, Weihao Xuan, Leyang Cui, Rob Voigt


Abstract
Large reasoning models (LRMs) have recently demonstrated impressive capabilities in complex reasoning tasks by leveraging increased test-time computation and exhibiting behaviors reminiscent of human-like self-reflection. While LRMs show a clear capacity for valuable self-reflection, how this ability interacts with other model behaviors remains underexplored. We investigate this connection by analyzing verbalized confidence, i.e., how models articulate their certainty, as a lens into the nature of self-reflection in LRMs. We find that supervised fine-tuning on reasoning traces (i.e., distillation) and reinforcement learning can improve verbalized calibration in reasoning-intensive settings in a progressive, laddered fashion. However, our results also indicate that reasoning models may possess a diminished awareness of their own knowledge boundaries, as evidenced by significantly lower “I don’t know” response rates on factuality benchmarks. Moreover, we examine the relationship between verbalized confidence and reasoning chains, finding that models tend to express higher confidence when providing shorter or less elaborate reasoning. Our findings highlight how reasoning-oriented training can enhance performance in reasoning-centric tasks while potentially incurring a reasoning tax: in small-scale models, a reduced ability to accurately recognize the limits of their own knowledge. More broadly, our work showcases how this erosion of knowledge boundaries can compromise model faithfulness, as models grow more confident without a commensurate understanding of when they should abstain.
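The abstract measures two quantities: how well verbalized confidence is calibrated, and how often models abstain with “I don’t know.” The sketch below illustrates one standard way such quantities are computed, using expected calibration error (ECE) over parsed confidence scores and a simple abstention rate; it is an assumption-laden illustration, not the paper’s actual evaluation code, and the data structure, bin count, and abstention handling are choices made here for clarity.

```python
# Illustrative sketch (not the paper's evaluation code): expected calibration
# error (ECE) over verbalized confidences, plus an "I don't know" abstention
# rate. Field names and binning choices are assumptions for this example.

from dataclasses import dataclass


@dataclass
class Prediction:
    confidence: float  # verbalized confidence in [0, 1], parsed from the model's answer
    correct: bool      # whether the final answer matched the gold label
    abstained: bool    # True if the model responded "I don't know"


def expected_calibration_error(preds, n_bins=10):
    """ECE over answered items: bin by confidence, then take the
    bin-size-weighted mean of |accuracy - mean confidence| per bin."""
    answered = [p for p in preds if not p.abstained]
    if not answered:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for p in answered:
        idx = min(int(p.confidence * n_bins), n_bins - 1)
        bins[idx].append(p)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        acc = sum(p.correct for p in bucket) / len(bucket)
        conf = sum(p.confidence for p in bucket) / len(bucket)
        ece += (len(bucket) / len(answered)) * abs(acc - conf)
    return ece


def abstention_rate(preds):
    """Fraction of 'I don't know' responses, a proxy for knowledge-boundary awareness."""
    return sum(p.abstained for p in preds) / len(preds) if preds else 0.0
```

A lower ECE means stated confidence tracks empirical accuracy more closely, while a sharply lower abstention rate on factuality benchmarks is the kind of signal the abstract describes as a weakened sense of knowledge boundaries.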
Anthology ID:
2025.emnlp-main.73
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1394–1407
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.73/
Cite (ACL):
Qingcheng Zeng, Weihao Xuan, Leyang Cui, and Rob Voigt. 2025. Thinking Out Loud: Do Reasoning Models Know When They’re Right?. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 1394–1407, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Thinking Out Loud: Do Reasoning Models Know When They’re Right? (Zeng et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.73.pdf
Checklist:
 2025.emnlp-main.73.checklist.pdf