Speculative Decoding for Multi-Sample Inference
Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li
Abstract
We propose a novel speculative decoding method tailored for multi-sample reasoning scenarios, such as self-consistency and Best-of-N sampling. Our method exploits the intrinsic consensus of parallel generation paths to synthesize high-quality draft tokens without requiring auxiliary models or external databases. By dynamically analyzing structural patterns across parallel reasoning paths through a probabilistic aggregation mechanism, it identifies consensus token sequences that align with the decoding distribution. Evaluations on mathematical reasoning and code generation benchmarks demonstrate a substantial improvement in draft acceptance rates over baselines, while reducing the latency in draft token construction. This work establishes a paradigm shift for efficient multi-sample inference, enabling seamless integration of speculative decoding with sampling-based reasoning techniques.- Anthology ID:
- 2025.findings-emnlp.668
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2025
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 12523–12533
- Language:
- URL:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.668/
- DOI:
- 10.18653/v1/2025.findings-emnlp.668
- Cite (ACL):
- Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, and Kan Li. 2025. Speculative Decoding for Multi-Sample Inference. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 12523–12533, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Speculative Decoding for Multi-Sample Inference (Li et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.668.pdf