Benoit Dumoulin
2026
A Reward-Guided Dual-Phase Framework for Adaptive Inference-Time Reasoning
Yingqian Cui | Zhenwei Dai | Pengfei He | Bing He | Hui Liu | Zhan Shi | Xianfeng Tang | Jingying Zeng | Suhang Wang | Yue Xing | Jiliang Tang | Benoit Dumoulin
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Models (LLMs) have made strong progress in reasoning. A common inference-time approach to further improve reasoning performance is tree-based search, which decomposes the reasoning process into multiple steps, expands multiple reasoning paths, and uses reward models to prune and select candidates. However, our exploration suggests that this simple decomposition can lead to suboptimal search efficiency: while planning is generally harder, execution errors are more likely to propagate to later steps. This indicates that planning and execution play different roles in reasoning and should be treated differently during tree-based search. To improve search efficiency, we propose a dual-phase test-time scaling framework that separates reasoning into planning and execution and performs search over each phase independently. To further refine the algorithm, we also introduce a dynamic budget allocation mechanism that adaptively redistributes sampling effort based on reward feedback, allowing early stopping on confident steps and reallocation of computation to more challenging steps. Experiments on both math reasoning and code generation benchmarks demonstrate that our approach consistently improves accuracy while reducing redundant computation.
2010
Search with Synonyms: Problems and Solutions
Xing Wei | Fuchun Peng | Huishin Tseng | Yumao Lu | Xuerui Wang | Benoit Dumoulin
Coling 2010: Posters