Xiaojian Zhong
2026
SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution
Zhenyu He | Qingping Yang | Wei Shen | Xiaojian Zhong | Kechi Zhang | Chenxin An | Wenlei Shi | Tianle Cai | Di He | Jiaze Chen | Jingjing Xu
Findings of the Association for Computational Linguistics: ACL 2026
Automated software engineering, particularly resolving real-world issues on benchmarks like SWE-bench, remains a significant challenge for Large Language Models (LLMs). To address this, we introduce SWE-Swiss, a two-phase training recipe that systematically develops issue-resolution capabilities. Our approach first decomposes issue resolution into three core skills: Localization, Repair, and Unit Test Generation. In the first phase, we perform multi-task Supervised Fine-Tuning (SFT) on three new, meticulously curated datasets to build a versatile foundation. The second phase applies targeted Reinforcement Learning (RL), using direct feedback from test execution to boost the critical skill of code repair. The resulting model, SWE-Swiss-32B, establishes a new state-of-the-art for open-source models in its size class, scoring 60.2% on the SWE-bench Verified benchmark, which places it in the same top-tier performance bracket as much larger models. Finally, we show that despite its specialized training, SWE-Swiss-32B generalizes well to other common LLM benchmarks. To accelerate research in the community, we are open-sourcing the models and our complete training datasets.
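The abstract says the RL phase uses direct feedback from test execution but gives no details of the reward. As a rough illustration only, the sketch below shows one way such a signal could be computed: run the tests against a candidate repair in a subprocess and return a binary reward. The function name `execution_reward`, the binary 0/1 scheme, and the timeout value are all assumptions made for this example, not the authors' implementation.

```python
import subprocess
import sys


def execution_reward(candidate_src: str, test_src: str, timeout_s: float = 10.0) -> float:
    """Hypothetical binary reward: 1.0 if the tests pass on the candidate code, else 0.0.

    Assumption: the candidate and its tests are plain Python source strings that
    can be concatenated and run in a fresh interpreter.
    """
    program = candidate_src + "\n\n" + test_src + "\n"
    try:
        result = subprocess.run(
            [sys.executable, "-c", program],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # A repair that hangs is treated the same as a failing one.
        return 0.0
    # A non-zero exit code means an assertion failed or the code crashed.
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    tests = "assert add(1, 2) == 3"
    good_patch = "def add(a, b):\n    return a + b"
    bad_patch = "def add(a, b):\n    return a - b"
    print(execution_reward(good_patch, tests))
    print(execution_reward(bad_patch, tests))
```

A real setup would run a repository's test suite inside an isolated environment rather than concatenating strings, but the shape of the signal (execute, observe pass or fail, map to a scalar) is the same.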