ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization

Zhensheng Jin; Xinze Li; Yifan Ji; Chunyi Peng; Zhenghao Liu (刘正皓); Qi Shi; Yukun Yan (闫宇坤); Shuo Wang (王硕); Furong Peng; Ge Yu (于戈)

doi:10.18653/v1/2025.findings-emnlp.770

ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization

Zhensheng Jin, Xinze Li, Yifan Ji, Chunyi Peng, Zhenghao Liu, Qi Shi, Yukun Yan, Shuo Wang, Furong Peng, Ge Yu

Abstract

Recent advances in Chain-of-Thought (CoT) prompting have substantially improved the reasoning capabilities of Large Language Models (LLMs). However, these methods often suffer from overthinking, leading to unnecessarily lengthy or redundant reasoning traces. Existing approaches attempt to mitigate this issue through curating multiple reasoning chains for training LLMs, but their effectiveness is often constrained by the quality of the generated data and prone to overfitting. To address the challenge, we propose Reasoning Compression Through Stepwise Trials (ReCUT), a novel method aimed at balancing the accuracy and length of reasoning trajectory. Specifically, ReCUT employs a stepwise exploration mechanism and a long-short switched sampling strategy, enabling LLMs to incrementally generate diverse reasoning paths. These paths are evaluated and used to construct preference pairs to train two specialized models (Gemini LLMs)—one optimized for reasoning accuracy, the other for shorter reasoning. A final integrated model is obtained by interpolating the parameters of these two models. Experimental results across multiple math reasoning datasets and backbone models demonstrate that ReCUT significantly reduces reasoning lengths by approximately 30-50%, while maintaining or improving reasoning accuracy compared to various baselines. All codes and data will be released via https://github.com/NEUIR/ReCUT.

Anthology ID:: 2025.findings-emnlp.770
Volume:: Findings of the Association for Computational Linguistics: EMNLP 2025
Month:: November
Year:: 2025
Address:: Suzhou, China
Editors:: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:: Findings
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 14269–14282
Language:
URL:: https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.770/
DOI:: 10.18653/v1/2025.findings-emnlp.770
Bibkey:
Cite (ACL):: Zhensheng Jin, Xinze Li, Yifan Ji, Chunyi Peng, Zhenghao Liu, Qi Shi, Yukun Yan, Shuo Wang, Furong Peng, and Ge Yu. 2025. ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14269–14282, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):: ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization (Jin et al., Findings 2025)
Copy Citation:
PDF:: https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.770.pdf
Checklist:: 2025.findings-emnlp.770.checklist.pdf

PDF Cite Search Checklist Fix data