SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration

Junhan Shi; Yijia Zhu; Zhenning Shi; Dan Zhao; Qing Li; Yong Jiang

doi:10.18653/v1/2025.findings-emnlp.1326

Abstract

Large Reasoning Models (LRMs) demonstrate strong performance on complex tasks through chain-of-thought (CoT) reasoning. However, they suffer from high inference latency due to lengthy reasoning chains. In this paper, we propose SpecCoT, a collaborative framework that combines large and small models for effective yet efficient reasoning. Unlike traditional speculative decoding, which operates at the token level, SpecCoT adopts a step-level verification strategy: the large model first establishes the reasoning direction, and for each intermediate step, the small model generates multiple candidate drafts in parallel. The large model then verifies these drafts, either selecting the most suitable one or rejecting them all and generating its own. SpecCoT thus balances reasoning quality with inference efficiency through fine-grained model cooperation. Experiments across diverse tasks show that SpecCoT reduces inference latency by 1.7–4.1× while maintaining accuracy comparable to standard large-model inference.
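The step-level loop described in the abstract can be sketched roughly as follows. This is a minimal illustration based only on the abstract, not the authors' implementation: the function name step_level_speculation, the three callables, the stop marker, and the default parameters are all hypothetical. "Establishing the reasoning direction" is read here as the large model writing the first step, which is one possible interpretation, and drafts are produced sequentially rather than in parallel to keep the sketch short.

```python
from typing import Callable, List, Optional

# Hypothetical interfaces; none of these names come from the paper.
DraftFn = Callable[[str, List[str]], str]      # small model: propose one next step
SelectFn = Callable[[str, List[str], List[str]], Optional[int]]  # large model: pick a draft index, or None to reject all
GenerateFn = Callable[[str, List[str]], str]   # large model: write its own step


def step_level_speculation(
    question: str,
    draft_step: DraftFn,
    select_draft: SelectFn,
    generate_step: GenerateFn,
    num_drafts: int = 3,
    max_steps: int = 8,
    stop_marker: str = "ANSWER:",
) -> List[str]:
    """Build a reasoning chain step by step, preferring small-model drafts."""
    # Assumption: the large model sets the direction by writing the first step.
    steps: List[str] = [generate_step(question, [])]
    for _ in range(max_steps):
        # The abstract describes parallel drafting; this sketch drafts sequentially.
        drafts = [draft_step(question, steps) for _ in range(num_drafts)]
        choice = select_draft(question, steps, drafts)
        if choice is None:
            # All drafts rejected: fall back to the large model's own step.
            step = generate_step(question, steps)
        else:
            step = drafts[choice]
        steps.append(step)
        if step.startswith(stop_marker):
            break
    return steps
```

In this reading, any latency gain would come from the large model scoring several short drafts at once instead of decoding each step itself; the fallback branch is what lets the large model override the small model when every draft is rejected.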

Anthology ID: 2025.findings-emnlp.1326
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 24405–24415
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1326/
DOI: 10.18653/v1/2025.findings-emnlp.1326
Cite (ACL): Junhan Shi, Yijia Zhu, Zhenning Shi, Dan Zhao, Qing Li, and Yong Jiang. 2025. SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 24405–24415, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration (Shi et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1326.pdf
Checklist: 2025.findings-emnlp.1326.checklist.pdf