Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem

Yubo Wang, Ping Nie, Kai Zou, Lijun Wu, Wenhu Chen


Abstract
Critique Fine-Tuning (CFT) has recently emerged as a promising paradigm for unlocking the reasoning capabilities of large language models (LLMs). In this work, we introduce one-shot CFT, a highly compute-efficient approach that leverages critique data generated from a single math problem. Remarkably, this method yields significant gains in reasoning accuracy, surpassing one-shot RLVR (Reinforcement Learning with Verifiable Rewards) while requiring 15 to 20 times less compute. Given one math problem, we first prompt a set of diverse small models to produce candidate solutions, then use frontier models such as GPT-4.1 to generate high-quality critiques of these responses. We fine-tune Qwen and Llama family models ranging from 1.5B to 14B parameters with CFT. With just 5 GPU hours, our models achieve up to a 16 percent absolute improvement in average accuracy across six mathematical reasoning benchmarks (for example, Qwen2.5-Math-7B improves from 26 percent to 42 percent). Furthermore, ablation studies show that one-shot CFT is robust across different choices of seed problem. Our findings suggest an extremely compute-efficient way to unleash the reasoning potential of LLMs.
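For a concrete picture of the pipeline the abstract describes, the sketch below mirrors its data-generation step: candidate solutions to a single seed problem are critiqued by a frontier model, and the resulting (problem, solution, critique) triples become the fine-tuning corpus. This is a minimal illustration assuming the OpenAI Python client; the seed problem, prompt wording, and candidate solutions are hypothetical stand-ins, not taken from the paper's released code.

```python
# Minimal sketch of one-shot CFT data generation, assuming the OpenAI Python
# client. The seed problem, prompt template, and candidate solutions are
# illustrative stand-ins, not the paper's actual data or code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SEED_PROBLEM = "Solve for x: 2x + 3 = 11."  # stand-in for the single seed math problem

CRITIQUE_PROMPT = (
    "You are a careful math teacher. Critique the following candidate solution "
    "step by step, point out any errors, and state whether the final answer is "
    "correct.\n\nProblem: {problem}\n\nCandidate solution:\n{solution}"
)

def critique(problem: str, solution: str) -> str:
    """Ask a frontier model (GPT-4.1 in the paper) to critique one candidate solution."""
    resp = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{
            "role": "user",
            "content": CRITIQUE_PROMPT.format(problem=problem, solution=solution),
        }],
    )
    return resp.choices[0].message.content

# In the paper, candidate solutions come from a set of diverse small models;
# two hand-written stand-ins are used here so the sketch is self-contained.
candidate_solutions = [
    "2x + 3 = 11, so 2x = 8 and x = 4.",
    "2x = 11 + 3 = 14, so x = 7.",  # deliberately wrong, so the critique has an error to catch
]

cft_dataset = [
    {"problem": SEED_PROBLEM, "solution": s, "critique": critique(SEED_PROBLEM, s)}
    for s in candidate_solutions
]
```

Each triple can then serve as an ordinary supervised fine-tuning example in which the model, given the problem and a candidate solution, learns to produce the critique.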
Anthology ID:
2025.emnlp-main.149
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rosé, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3017–3027
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.149/
Cite (ACL):
Yubo Wang, Ping Nie, Kai Zou, Lijun Wu, and Wenhu Chen. 2025. Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 3017–3027, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem (Wang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.149.pdf
Checklist:
2025.emnlp-main.149.checklist.pdf