DeepPrune: Parallel Scaling without Inter-trace Redundancy

Shangqing Tu; Yaxuan Li; Yushi Bai; Lei Hou; Juanzi Li

Abstract

Parallel scaling has emerged as a powerful paradigm for enhancing reasoning in large language models (LLMs) by generating multiple Chain-of-Thought (CoT) traces simultaneously. However, this approach is computationally inefficient because of *inter-trace redundancy*: our analysis reveals that over 80% of parallel reasoning traces yield identical final answers, which represents substantial wasted computation. To address this efficiency bottleneck, we propose **DeepPrune**, a framework that enables efficient parallel scaling through dynamic pruning. Our method features a specialized judge model trained with oversampling techniques to predict answer equivalence from partial reasoning traces, achieving 0.7072 AUROC on equivalence prediction across unseen reasoning models. This is combined with an online greedy clustering algorithm that dynamically prunes redundant paths while preserving answer diversity. Evaluations across three challenging benchmarks (AIME 2024, AIME 2025, and GPQA) and multiple reasoning models show that DeepPrune reduces token usage by 65.73% to 88.50% compared to conventional consensus sampling, while keeping accuracy within 3.4 percentage points. Our work establishes a new standard for efficient parallel reasoning. Our code and data are available at https://github.com/THU-KEG/DeepPrune/
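
The online greedy clustering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `threshold` parameter, and the `toy_judge` stand-in (which simply compares the last token of two partial traces instead of calling a trained judge model) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def greedy_cluster(
    traces: Sequence[str],
    same_answer_prob: Callable[[str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[List[int]]:
    """Process traces in arrival order. Each trace joins the first cluster
    whose representative it is predicted to agree with; otherwise it opens
    a new cluster."""
    clusters: List[List[int]] = []
    for i, trace in enumerate(traces):
        for cluster in clusters:
            representative = traces[cluster[0]]
            if same_answer_prob(representative, trace) >= threshold:
                cluster.append(i)  # predicted redundant: a pruning candidate
                break
        else:
            clusters.append([i])  # predicted to lead to a new answer
    return clusters


def toy_judge(a: str, b: str) -> float:
    """Hypothetical stand-in for a learned judge model: returns 1.0 when the
    two partial traces end in the same token, else 0.0."""
    return 1.0 if a.split()[-1] == b.split()[-1] else 0.0


partial_traces = [
    "x = 4 so 42",
    "try 6*7 giving 42",
    "guess 17",
    "therefore 42",
]
clusters = greedy_cluster(partial_traces, toy_judge)
# Keep one trace per cluster; the rest could be stopped early.
survivors = [cluster[0] for cluster in clusters]
print(clusters, survivors)
```

In this sketch only one trace per cluster continues generating, which is one simple way to cut tokens while still keeping every distinct predicted answer represented.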

Anthology ID: 2026.findings-acl.656
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13389–13403
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.656/
Cite (ACL): Shangqing Tu, Yaxuan Li, Yushi Bai, Lei Hou, and Juanzi Li. 2026. DeepPrune: Parallel Scaling without Inter-trace Redundancy. In Findings of the Association for Computational Linguistics: ACL 2026, pages 13389–13403, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): DeepPrune: Parallel Scaling without Inter-trace Redundancy (Tu et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.656.pdf
Checklist: 2026.findings-acl.656.checklist.pdf
