@inproceedings{zhang-2025-confidence,
    title = "Confidence-Aware Reasoning: Optimizing Self-Guided Thinking Trajectories in Large Reasoning Models",
    author = "Zhang, Jiaxin",
    editor = "Potdar, Saloni and
      Rojas-Barahona, Lina and
      Montella, Sebastien",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-industry.146/",
    doi = "10.18653/v1/2025.emnlp-industry.146",
    pages = "2081--2095",
    isbn = "979-8-89176-333-3",
    abstract = "Chain-of-thought enables large reasoning models (LRMs) to reason through multi-step problems but often leads to unnecessarily long or redundant reasoning traces, a phenomenon known as \textit{overthinking}. This results in inflated inference costs and potential degradation in answer quality. To address these challenges, we propose Confidence-Aware Reasoning (CAR), an inference-time framework that optimizes reasoning trajectories by selectively pruning low-utility reasoning blocks and halting early when sufficient confidence has been achieved. CAR is theoretically grounded in Bayesian optimal experimental design, treating each reasoning block as a sequential decision whose utility is approximated by its marginal contribution to reducing final answer uncertainty. We introduce a lightweight implementation that leverages token-level confidence to dynamically modulate reasoning depth without additional supervision. Evaluations on multiple benchmarks, including AMC, AIME, GPQA-Diamond, and MATH-500 show that CAR improves answer accuracy by up to +13.3{\%}, while reducing average reasoning length by 40{\%}{--}50{\%}. Our findings demonstrate that information-theoretic insights can effectively control self-guided reasoning and enable LRMs to ``think just enough'' at test time."
}
Markdown (Informal)
[Confidence-Aware Reasoning: Optimizing Self-Guided Thinking Trajectories in Large Reasoning Models](https://aclanthology.org/2025.emnlp-industry.146/) (Zhang, EMNLP 2025)
ACL