Abdellah Ghassel
2026
Entropy-Gated Branching for Efficient Test-Time Reasoning
Xianzhi Li | Ethan Callanan | Abdellah Ghassel | Xiaodan Zhu
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Test-time compute methods can significantly improve the reasoning capabilities and problem-solving accuracy of large language models. However, these approaches require substantially more computational resources, and most of that computation is wasted on exploring low-diversity branches where the model already exhibits high confidence. We observe that a small subset of uncertain reasoning steps has a disproportionately large impact on final prediction accuracy, and that branching at these points tends to yield higher-quality and more diverse candidate reasoning steps. We therefore introduce Entropy-Gated Branching, a novel inference technique that dynamically allocates computational resources by expanding prediction sequences only at points of high uncertainty. Our method uses entropy as a gating mechanism to identify when branching is most beneficial, coupled with an external feedback model to rank and prune candidate branches. Empirical results on mathematical and financial reasoning benchmarks show that this strategy improves accuracy by 22.6% over standard inference. On math benchmarks, it runs 31%-75% faster than test-time beam search while achieving higher performance. Our results show that dynamic resource allocation during inference can substantially improve both efficiency and effectiveness, offering a more scalable pathway to enhanced LLM reasoning capabilities. We release our code and tools at https://github.com/JXL884/entropy_gated_branching
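The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes Shannon entropy over a next-step probability distribution as the uncertainty signal, and every name and default in it (`shannon_entropy`, `should_branch`, `expand_step`, `propose`, `score`, `threshold`, `num_branches`, `keep`) is hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_branch(probs: Sequence[float], threshold: float) -> bool:
    """Gate: branch only when the distribution is uncertain enough."""
    return shannon_entropy(probs) > threshold


def expand_step(
    prefix: str,
    next_probs: Sequence[float],
    propose: Callable[[str, int], List[str]],  # hypothetical candidate generator
    score: Callable[[str], float],             # hypothetical feedback model
    threshold: float = 1.0,                    # assumed value, not from the paper
    num_branches: int = 3,                     # assumed value, not from the paper
    keep: int = 1,                             # assumed value, not from the paper
) -> List[str]:
    """Extend `prefix` by one reasoning step.

    Low entropy: take a single continuation and skip the feedback model.
    High entropy: sample several continuations, rank them with `score`,
    and keep only the top `keep`.
    """
    if not should_branch(next_probs, threshold):
        return propose(prefix, 1)
    candidates = propose(prefix, num_branches)
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:keep]
```

In this sketch the feedback model is called only on high-entropy steps, which mirrors the abstract's point that computation is concentrated where the model is uncertain rather than spent uniformly as in beam search.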