Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair
Bojian Xiong, Yikun Lei, Xikai Liu, Shaowei Zhang, Pengyun Zhu, Yan Liu, Yongqi Leng, Ling Shi, Meizhi Zhong, Yurong Zhang, Yan Gao, Yiwu, Yao Hu, Deyi Xiong
Abstract
Large language models often struggle in multi-file coding scenarios with strong inter-file dependencies, as typified by SWE-bench. To mitigate this issue, we propose Think-Search-Patch (TSP), a retrieval-augmented reasoning framework for repository-level code repair. At the Think stage, our system breaks down a coding task and creates a clear search query. Next, at the Search stage, it retrieves relevant code snippets using models like E5. At the final Patch stage, it generates standardized patches based on the key snippets. In addition to the proposed framework, we enhance system reliability through a two-stage training process. At the first stage, the system undergoes supervised fine-tuning (SFT) on our TSP dataset. At the subsequent stage, we employ rejection sampling with correction to generate preference pairs for Direct Preference Optimization (DPO) training, thereby reducing errors in the intermediate phases. Experimental results demonstrate that the TSP framework improves retrieval accuracy and repair success on SWE-bench Lite, even surpassing larger models in managing extensive code contexts and successfully addressing bugs that span multiple files. All data and code are available at https://github.com/Gengar0215/TSP-framework.
- Anthology ID:
- 2025.emnlp-industry.109
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou (China)
- Editors:
- Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1555–1566
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.109/
- Cite (ACL):
- Bojian Xiong, Yikun Lei, Xikai Liu, Shaowei Zhang, Pengyun Zhu, Yan Liu, Yongqi Leng, Ling Shi, Meizhi Zhong, Yurong Zhang, Yan Gao, Yiwu, Yao Hu, and Deyi Xiong. 2025. Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1555–1566, Suzhou (China). Association for Computational Linguistics.
- Cite (Informal):
- Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair (Xiong et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.109.pdf
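
The three stages named in the abstract (Think, Search, Patch) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Snippet`, `tokenize`, `think`, `search`, and `patch` are hypothetical, the Think step is a keyword heuristic standing in for LLM reasoning, the Search step uses plain token overlap in place of a dense retriever such as E5, and the corrected code is supplied by hand rather than generated by a model.

```python
import difflib
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    path: str  # file the snippet came from
    text: str  # snippet source code


def tokenize(text: str) -> set[str]:
    # Lowercased identifier-like tokens; a crude stand-in for a learned embedding.
    return set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def think(issue: str) -> str:
    # Think: reduce the issue to a search query.
    # Assumption: keep tokens longer than 3 characters (a model would do this in TSP).
    return " ".join(sorted(t for t in tokenize(issue) if len(t) > 3))


def search(query: str, corpus: list[Snippet], k: int = 1) -> list[Snippet]:
    # Search: rank snippets by token overlap with the query (not E5).
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda s: len(q & tokenize(s.text)), reverse=True)
    return ranked[:k]


def patch(snippet: Snippet, fixed_text: str) -> str:
    # Patch: emit a unified diff between the retrieved snippet and its fix.
    diff = difflib.unified_diff(
        snippet.text.splitlines(keepends=True),
        fixed_text.splitlines(keepends=True),
        fromfile=f"a/{snippet.path}",
        tofile=f"b/{snippet.path}",
    )
    return "".join(diff)


# Toy repository and issue, invented for illustration only.
corpus = [
    Snippet("utils/dates.py", "def parse_date(value):\n    return value.split('/')\n"),
    Snippet("core/user.py", "def load_user(user_id):\n    return db.get(user_id)\n"),
]
issue = "parse_date fails on ISO strings because it splits the value on slashes"

query = think(issue)
top = search(query, corpus, k=1)[0]
diff = patch(top, "def parse_date(value):\n    return value.split('-')\n")
print(diff)
```

Keeping the stages as separate functions mirrors the abstract's point that errors can arise in the intermediate phases: the query and the retrieved snippet can each be inspected before a patch is produced.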