TRANS-ZERO: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data
Wei Zou, Sen Yang, Yu Bao, Shujian Huang, Jiajun Chen, Shanbo Cheng
Abstract
The rise of Large Language Models (LLMs) has reshaped machine translation (MT), but multilingual MT still relies heavily on parallel data for supervised fine-tuning (SFT), facing challenges such as data scarcity for low-resource languages and catastrophic forgetting. To address these issues, we propose TRANS-ZERO, a self-play framework that leverages only monolingual data and the intrinsic multilingual knowledge of LLMs. TRANS-ZERO combines Genetic Monte-Carlo Tree Search (G-MCTS) with preference optimization, achieving strong translation performance that rivals supervised methods. Experiments demonstrate that this approach not only matches the performance of models trained on large-scale parallel data but also excels in non-English translation directions. Further analysis reveals that G-MCTS itself significantly enhances translation quality by exploring semantically consistent candidates through iterative translation, providing a robust foundation for the framework’s success.
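The abstract describes the framework only at a high level. As a rough illustration of the self-play loop it outlines (explore translation candidates through iterative translation, score them for semantic consistency, then distill the ranking into preference pairs), the following toy Python sketch may help; every function name, the round-trip reward, the pivot set, and the pairing rule are assumptions made here for illustration, not the authors' implementation.

```python
import random
from dataclasses import dataclass

def translate(text: str, src: str, tgt: str) -> str:
    """Stand-in for an LLM translation call (hypothetical interface).

    A real system would prompt the LLM; this toy version just swaps a
    language tag and occasionally drops a word to simulate lossy output.
    """
    words = text.removeprefix(f"[{src}]").split()
    if len(words) > 1 and random.random() < 0.3:
        words.pop(random.randrange(len(words)))
    return f"[{tgt}]" + " ".join(words)

def consistency_reward(source: str, candidate: str, src: str, tgt: str) -> float:
    """Toy semantic-consistency proxy: word overlap after a round trip.

    The abstract says G-MCTS explores semantically consistent candidates;
    this particular scoring rule is an assumption for illustration.
    """
    back = translate(candidate, tgt, src).removeprefix(f"[{src}]")
    overlap = set(source.split()) & set(back.split())
    return len(overlap) / max(len(source.split()), 1)

@dataclass
class Node:
    text: str
    reward: float

def search_step(source: str, src: str, tgt: str, width: int = 4) -> list[Node]:
    """One search expansion: a direct translation plus pivoted iterative
    translations (src -> pivot -> tgt), ranked by the toy reward."""
    pivots = random.sample(["de", "fr", "zh"], k=2)  # assumed pivot set
    candidates = [translate(source, src, tgt)]
    for pivot in pivots:
        mid = translate(source, src, pivot)
        candidates.append(translate(mid, pivot, tgt))
    nodes = [Node(c, consistency_reward(source, c, src, tgt)) for c in candidates]
    return sorted(nodes, key=lambda n: n.reward, reverse=True)[:width]

def collect_preference_pairs(monolingual: list[str], src: str, tgt: str) -> list[tuple[str, str, str]]:
    """Turn search outcomes into (source, chosen, rejected) triples that a
    DPO-style preference-optimization step could consume (assumed pairing)."""
    pairs = []
    for sentence in monolingual:
        ranked = search_step(sentence, src, tgt)
        if len(ranked) >= 2 and ranked[0].reward > ranked[-1].reward:
            pairs.append((sentence, ranked[0].text, ranked[-1].text))
    return pairs

if __name__ == "__main__":
    random.seed(0)
    data = ["the cat sat on the mat", "translation without parallel data"]
    for src_sent, chosen, rejected in collect_preference_pairs(data, "en", "es"):
        print(f"src={src_sent!r}\n  chosen={chosen!r}\n  rejected={rejected!r}")
```

In the actual framework the expansion policy and reward come from the G-MCTS procedure and the LLM itself; the sketch only conveys the data flow from monolingual text, through search, to preference data for optimization.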
- Anthology ID: 2025.findings-acl.637
- Volume: Findings of the Association for Computational Linguistics: ACL 2025
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venues: Findings | WS
- Publisher: Association for Computational Linguistics
- Pages: 12337–12347
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.637/
- Cite (ACL): Wei Zou, Sen Yang, Yu Bao, Shujian Huang, Jiajun Chen, and Shanbo Cheng. 2025. TRANS-ZERO: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data. In Findings of the Association for Computational Linguistics: ACL 2025, pages 12337–12347, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): TRANS-ZERO: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data (Zou et al., Findings 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.637.pdf