Bootstrapping Code Translation with Weighted Multilanguage Exploration

Yuhan Wu, Huan Zhang, Wei Cheng, Chen Shen, Jingyue Yang, Wei Hu


Abstract
Code translation across multiple programming languages is essential yet challenging due to two vital obstacles: scarcity of parallel data paired with executable test oracles, and optimization imbalance when handling diverse language pairs. We propose BootTrans, a bootstrapping method that resolves both obstacles. Its key idea is to leverage the functional invariance and cross-lingual portability of test suites, adapting abundant pivot-language unit tests to serve as universal verification oracles for multilingual reinforcement learning (RL) training. Our method introduces a dual-pool architecture with seed and exploration pools to progressively expand training data via execution-guided experience collection. Furthermore, we design a language-aware weighting mechanism that dynamically prioritizes harder translation directions based on relative performance across sibling languages, mitigating optimization imbalance. Extensive experiments on the HumanEval-X and TransCoder-Test benchmarks demonstrate substantial improvements over baseline LLMs across all translation directions, with ablation studies validating the effectiveness of both bootstrapping and weighting components.
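The abstract does not give the weighting formula, so the following is only an illustrative sketch of what "prioritizing harder translation directions based on relative performance across sibling languages" could look like. It assumes each target language has a measured test pass rate, and it uses a softmax over each language's deficit relative to the sibling mean. The function name `direction_weights`, the `temperature` parameter, and the example pass rates are all hypothetical and are not taken from the paper.

```python
import math


def direction_weights(pass_rates, temperature=1.0):
    """Hypothetical language-aware weighting (not the paper's formula).

    pass_rates maps each target language to its current test pass rate
    in [0, 1]. Languages that lag behind the sibling mean get weights
    above 1.0; languages ahead of it get weights below 1.0.
    """
    n = len(pass_rates)
    mean_rate = sum(pass_rates.values()) / n
    # Deficit relative to siblings: positive when a language is harder.
    scores = {
        lang: (mean_rate - rate) / temperature
        for lang, rate in pass_rates.items()
    }
    # Numerically stable softmax over the deficits.
    max_score = max(scores.values())
    exp_scores = {lang: math.exp(s - max_score) for lang, s in scores.items()}
    total = sum(exp_scores.values())
    # Rescale so the weights average to 1.0 across sibling languages.
    return {lang: n * e / total for lang, e in exp_scores.items()}


# Illustrative pass rates only; these are not results from the paper.
rates = {"java": 0.8, "cpp": 0.6, "go": 0.4}
weights = direction_weights(rates, temperature=0.2)
```

In this sketch the weights would multiply each direction's training signal, so a lower `temperature` shifts more of the optimization budget toward the weakest direction, while equal pass rates leave every weight at 1.0.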
Anthology ID:
2026.acl-long.1678
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
36247–36259
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1678/
Cite (ACL):
Yuhan Wu, Huan Zhang, Wei Cheng, Chen Shen, Jingyue Yang, and Wei Hu. 2026. Bootstrapping Code Translation with Weighted Multilanguage Exploration. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 36247–36259, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Bootstrapping Code Translation with Weighted Multilanguage Exploration (Wu et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1678.pdf
Checklist:
 2026.acl-long.1678.checklist.pdf