Confidence-Weighted Token Set Cover for Early Hypothesis Pruning in Self-Consistency

Md Arafat Sultan, Ramón Fernandez Astudillo


Abstract
Despite its simplicity and efficacy, the high token expenditure of self-consistency can limit its practical utility. We investigate whether early hypothesis pruning can improve the token efficiency of self-consistency for long chain-of-thought reasoning tasks, while preserving its parallelism. Concretely, we generate all solutions in parallel but periodically prune intermediate hypotheses based on two lightweight indicators: (a) the model’s confidence in each hypothesis, and (b) the lexical coverage of all current hypotheses by candidate subsets. We design a fast weighted set cover algorithm that utilizes the two indicators; evaluation of five LLMs on three math benchmarks shows that our method improves token efficiency in most cases, with reductions of 10-35% in many.
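The pruning step described in the abstract can be illustrated with a minimal sketch. This page does not give the authors' algorithm, so the code below is only the standard greedy heuristic for weighted set cover, not the paper's method. It makes several assumptions: each hypothesis is reduced to a set of tokens, its cost is taken to be the inverse of the model's confidence, and the function name `greedy_weighted_cover` and its parameters are hypothetical.

```python
from typing import Dict, List, Set


def greedy_weighted_cover(
    token_sets: Dict[str, Set[str]],
    confidence: Dict[str, float],
    coverage_target: float = 1.0,
) -> List[str]:
    """Return the hypotheses to keep; all others would be pruned.

    Assumption: cost(h) = 1 / confidence[h], so the greedy score
    "new tokens per unit cost" equals len(new tokens) * confidence[h].
    """
    # Universe = every token that appears in any current hypothesis.
    universe: Set[str] = set().union(*token_sets.values())
    needed = coverage_target * len(universe)

    covered: Set[str] = set()
    kept: List[str] = []
    remaining = dict(token_sets)

    while len(covered) < needed and remaining:
        # Pick the hypothesis with the best confidence-weighted gain.
        best = max(
            remaining,
            key=lambda h: len(remaining[h] - covered) * confidence[h],
        )
        gain = remaining.pop(best) - covered
        if not gain:
            break  # nothing left can add coverage
        covered |= gain
        kept.append(best)
    return kept


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    hyps = {
        "h1": {"x", "=", "2", "so", "answer", "4"},
        "h2": {"x", "=", "2", "so", "answer"},
        "h3": {"let", "y", "=", "3", "answer", "9"},
    }
    conf = {"h1": 0.9, "h2": 0.8, "h3": 0.6}
    print(greedy_weighted_cover(hyps, conf))  # ['h1', 'h3']
```

In this toy run, `h2` adds no tokens beyond those of `h1`, so it is the hypothesis that would be pruned. Lowering `coverage_target` below 1.0 would prune more aggressively at the cost of lexical coverage.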
Anthology ID:
2026.findings-acl.2046
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
41148–41155
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2046/
Cite (ACL):
Md Arafat Sultan and Ramón Fernandez Astudillo. 2026. Confidence-Weighted Token Set Cover for Early Hypothesis Pruning in Self-Consistency. In Findings of the Association for Computational Linguistics: ACL 2026, pages 41148–41155, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Confidence-Weighted Token Set Cover for Early Hypothesis Pruning in Self-Consistency (Sultan & Astudillo, Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2046.pdf
Checklist:
 2026.findings-acl.2046.checklist.pdf