Reason-Code: Reliable Code Generation via Test-Driven Monte Carlo Tree Search

Zixu Li; Zhiqi Peng

Abstract

Large Language Models (LLMs) are widely used for code generation, but their performance degrades on tasks requiring multi-step logical reasoning. In practice, reliability is often improved through multi-sample inference, but its cost grows linearly with the sample size, making it impractical under strict latency constraints. To address this, we propose Reason-Code, an inference-time framework that formulates code generation as a search process guided by execution feedback. It integrates Monte Carlo Tree Search (MCTS) with a lightweight execution sandbox, where candidate programs are evaluated via unit tests. To control inference cost, Reason-Code adopts a conditional budgeting strategy that activates search only when greedy generation fails. Compared with large-sample Best-of-N sampling, Reason-Code is designed to improve reliability without paying the full linear cost of additional sampling under strict latency budgets. Experiments on HumanEval and MBPP show that Reason-Code matches strong sampling baselines (e.g., Best-of-10) with lower token cost and no regression. Additional matched-budget analyses show that execution-guided adaptive inference improves over independent sampling/filtering baselines, while differences between UCB-guided search and simpler iterative repair are limited at low budget.
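The conditional budgeting idea in the abstract (spend search budget only when greedy generation fails its unit tests) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `pass_rate`, `conditional_generate`, `greedy`, and `search_step` are hypothetical, the search step is a simple iterative-repair stand-in rather than MCTS, and plain `exec` stands in for the execution sandbox, which a real system would isolate properly.

```python
from typing import Callable, List, Tuple

TestCase = Tuple[tuple, object]  # (positional args, expected result)


def pass_rate(source: str, tests: List[TestCase], entry: str = "solve") -> float:
    """Return the fraction of unit tests a candidate program passes.

    ASSUMPTION: exec() is used only for illustration. It is NOT a sandbox.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        fn = namespace[entry]
    except Exception:
        return 0.0  # the program does not even load
    passed = 0
    for args, expected in tests:
        try:
            if fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(tests) if tests else 0.0


def conditional_generate(
    greedy: Callable[[], str],
    search_step: Callable[[str, float], str],
    tests: List[TestCase],
    budget: int,
) -> Tuple[str, float, int]:
    """Try greedy decoding first; search only if it fails the tests.

    Returns (best program, its pass rate, number of search calls used).
    """
    best = greedy()
    best_score = pass_rate(best, tests)
    used = 0
    while best_score < 1.0 and used < budget:
        # Hypothetical repair step: propose a new candidate from the best so far.
        candidate = search_step(best, best_score)
        used += 1
        score = pass_rate(candidate, tests)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, used


# Toy usage: stub "models" return fixed strings in place of LLM calls.
BUGGY = "def solve(a, b):\n    return a - b\n"
FIXED = "def solve(a, b):\n    return a + b\n"
TESTS: List[TestCase] = [((1, 2), 3), ((0, 0), 0)]

easy = conditional_generate(lambda: FIXED, lambda src, s: FIXED, TESTS, budget=4)
hard = conditional_generate(lambda: BUGGY, lambda src, s: FIXED, TESTS, budget=4)
```

In this toy run, `easy` uses no search calls because the greedy candidate already passes, while `hard` uses one. By contrast, Best-of-N would pay for all N samples on every task, whether or not the first one was correct.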

Anthology ID: 2026.acl-industry.30
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 447–458
URL: https://preview.aclanthology.org/ingestion-form-platform/2026.acl-industry.30/
Cite (ACL): Zixu Li and Zhiqi Peng. 2026. Reason-Code: Reliable Code Generation via Test-Driven Monte Carlo Tree Search. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 447–458, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Reason-Code: Reliable Code Generation via Test-Driven Monte Carlo Tree Search (Li & Peng, ACL 2026)
PDF: https://preview.aclanthology.org/ingestion-form-platform/2026.acl-industry.30.pdf