START: Self-taught Reasoner with Tools
Chengpeng Li, Mingfeng Xue, Zhenru Zhang, Jiaxi Yang, Beichen Zhang, Bowen Yu, Binyuan Hui, Junyang Lin, Xiang Wang, Dayiheng Liu
Abstract
Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in complex reasoning through long chain-of-thought, yet they struggle with precise computations and algorithmic operations. Integrating computational tools with LRMs remains challenging, particularly in activating and enhancing models’ tool-use capabilities without compromising their reasoning strengths. We address these challenges through START (Self-taught Reasoner with Tools), introducing two key innovations: (1) Hint-infer, a training-free approach that activates LRMs’ latent tool-use capabilities through artificial hints, enabling test-time performance scaling; (2) Hint-RFT, a self-training framework that enables models to learn effective tool utilization through diverse hint patterns and rejection-based data synthesis. Experiments show that START significantly improves state-of-the-art LRMs across challenging benchmarks, including competition-level mathematics (AMC23: 95.0%, AIME24: 75.6%) and graduate-level science questions (GPQA: 64.6%). Our analysis reveals that START not only enhances accuracy but also improves reasoning efficiency through strategic tool utilization, demonstrating broad applicability in complex reasoning scenarios.- Anthology ID:
- 2025.emnlp-main.683
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 13523–13564
- Language:
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.683/
- DOI:
- Cite (ACL):
- Chengpeng Li, Mingfeng Xue, Zhenru Zhang, Jiaxi Yang, Beichen Zhang, Bowen Yu, Binyuan Hui, Junyang Lin, Xiang Wang, and Dayiheng Liu. 2025. START: Self-taught Reasoner with Tools. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 13523–13564, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- START: Self-taught Reasoner with Tools (Li et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.683.pdf