Reason-Code: Reliable Code Generation via Test-Driven Monte Carlo Tree Search

Zixu Li, Zhiqi Peng


Abstract
Large Language Models (LLMs) are widely used for code generation, but their performance degrades on tasks requiring multi-step logical reasoning. In practice, reliability is often improved through multi-sample inference, but its cost grows linearly with the sample size, making it impractical under strict latency constraints. To address this, we propose Reason-Code, an inference-time framework that formulates code generation as a search process guided by execution feedback. It integrates Monte Carlo Tree Search (MCTS) with a lightweight execution sandbox, where candidate programs are evaluated via unit tests. To control inference cost, Reason-Code adopts a conditional budgeting strategy that activates search only when greedy generation fails. Compared with large-sample Best-of-N sampling, Reason-Code is designed to improve reliability without paying the full linear cost of additional sampling under strict latency budgets. Experiments on HumanEval and MBPP show that Reason-Code matches strong sampling baselines (e.g., Best-of-10) with lower token cost and no regression. Additional matched-budget analyses show that execution-guided adaptive inference improves over independent sampling/filtering baselines, while differences between UCB-guided search and simpler iterative repair are limited at low budget.
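The control flow the abstract describes (greedy generation first, with execution-guided search activated only when the greedy program fails its unit tests) can be illustrated with a minimal sketch. This is not the authors' implementation. All names (`pass_rate`, `Node`, `generate`, `expand`), the use of test pass rate as the reward, the `width` cap on children per node, and the exploration constant are assumptions made for illustration. The `expand` callable stands in for an LLM repair call, and `exec` is used only for brevity; it is not a real sandbox.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Test = Tuple[tuple, object]  # (positional args, expected return value)


def pass_rate(src: str, entry: str, tests: List[Test]) -> float:
    """Fraction of unit tests passed. exec() is NOT a sandbox; illustration only."""
    env: dict = {}
    try:
        exec(src, env)
        fn = env[entry]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            passed += fn(*args) == expected
        except Exception:
            pass
    return passed / len(tests) if tests else 0.0


@dataclass(eq=False)
class Node:
    src: str
    reward: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0


def ucb(node: Node, c: float = 1.4) -> float:
    """UCB1 score of a child relative to its parent."""
    if node.visits == 0:
        return float("inf")
    explore = math.sqrt(math.log(node.parent.visits) / node.visits)
    return node.value / node.visits + c * explore


def generate(greedy: Callable[[], str], expand: Callable[[str, float], str],
             entry: str, tests: List[Test], budget: int = 8,
             width: int = 2) -> Tuple[str, int]:
    """Return (program, search steps used). Zero steps means greedy sufficed."""
    first = greedy()
    reward = pass_rate(first, entry, tests)
    if reward == 1.0:
        return first, 0  # conditional budgeting: search is never activated
    root = Node(first, reward, visits=1, value=reward)
    best = root
    for step in range(1, budget + 1):
        # Selection: descend through fully expanded nodes by UCB.
        node = root
        while len(node.children) >= width:
            node = max(node.children, key=ucb)
        # Expansion and evaluation: one new candidate, scored by the tests.
        src = expand(node.src, node.reward)
        child = Node(src, pass_rate(src, entry, tests), parent=node)
        node.children.append(child)
        # Backpropagation: credit the child's reward along the path to the root.
        cur: Optional[Node] = child
        while cur is not None:
            cur.visits += 1
            cur.value += child.reward
            cur = cur.parent
        if child.reward > best.reward:
            best = child
        if child.reward == 1.0:
            return child.src, step
    return best.src, budget


# Usage with deterministic stand-ins for the model calls (hypothetical).
tests = [((1, 2), 3), ((0, 0), 0)]
fixes = iter(["def add(a, b): return a * b", "def add(a, b): return a + b"])
src, steps = generate(lambda: "def add(a, b): return a - b",
                      lambda parent_src, parent_reward: next(fixes),
                      "add", tests)
```

In this sketch the search cost is paid only on inputs where the greedy program fails, which is the property the conditional budgeting strategy relies on. A real deployment would replace `exec` with an isolated, time-limited execution environment.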
Anthology ID: 2026.acl-industry.30
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 447–458
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-industry.30/
Cite (ACL): Zixu Li and Zhiqi Peng. 2026. Reason-Code: Reliable Code Generation via Test-Driven Monte Carlo Tree Search. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 447–458, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Reason-Code: Reliable Code Generation via Test-Driven Monte Carlo Tree Search (Li & Peng, ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-industry.30.pdf