ERRV: Eliciting Efficient Reasoning through Reasoning Vectors for Policy Optimization in Large Language Models
Zhuowen Han, Lei Yang, Renren Jin, Dan Shi, Chenxi Sun, Deyi Xiong
Abstract
Recently, large reasoning models have achieved impressive performance, but their lengthy reasoning processes incur substantial inference overhead. To mitigate this issue, we propose the concept of reasoning vectors, representations extracted from the model’s hidden states, which can guide the model towards generating more concise and accurate responses. Building upon this, we present ERRV, a training framework that elicits efficient reasoning through reasoning vectors, which enables the model to generate high-quality responses during reinforcement learning. By performing targeted policy optimization on both accuracy and length objectives, ERRV effectively activates the model’s latent capability for efficient reasoning. Our experiments demonstrate that after training with ERRV, the model achieves approximately 30% reduction in reasoning length while maintaining stable accuracy, without guidance from the reasoning vector during inference. This establishes a trade-off between efficiency and performance. Furthermore, we identify key properties of reasoning vectors: robustness, characterized by high similarity before and after training, and generalizability, demonstrating applicability across base models, distilled models, RL-trained models, parameter-merged models, and mixed-thought models. These properties collectively guarantee the reliability and broad applicability of our approach.
- Anthology ID:
- 2026.findings-acl.1425
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 28557–28570
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1425/
- Cite (ACL):
- Zhuowen Han, Lei Yang, Renren Jin, Dan Shi, Chenxi Sun, and Deyi Xiong. 2026. ERRV: Eliciting Efficient Reasoning through Reasoning Vectors for Policy Optimization in Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 28557–28570, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- ERRV: Eliciting Efficient Reasoning through Reasoning Vectors for Policy Optimization in Large Language Models (Han et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1425.pdf