Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models

Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Ben He, Le Sun, Jingren Zhou, Junyang Lin


Abstract
The key to effective alignment lies in high-quality preference data. Recent research has focused on automated alignment, which involves developing alignment systems with minimal human intervention. However, prior research has predominantly focused on developing data generation methods while paying insufficient attention to quality control mechanisms; as a result, it often produces inaccurate and unhelpful data, leading to unpredictable benefits during iterative optimization. In this paper, we present Self-Steering Optimization (SSO), an algorithm that autonomously generates high-quality preference data, eliminating manual annotation requirements. SSO employs a specialized optimization objective to build a data generator from the policy model itself, which is used to produce accurate and on-policy data. We demonstrate SSO's effectiveness through comprehensive experiments on two series of models: Llama 3 and Qwen 2. Our evaluation across diverse benchmarks shows that SSO consistently outperforms baselines in human preference alignment and reward optimization. Further analysis validates SSO as a scalable framework for preference optimization, benefiting the advancement of automated alignment techniques.
Anthology ID:
2025.findings-acl.473
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9073–9085
URL:
https://preview.aclanthology.org/landing_page/2025.findings-acl.473/
Cite (ACL):
Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Ben He, Le Sun, Jingren Zhou, and Junyang Lin. 2025. Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 9073–9085, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models (Xiang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.findings-acl.473.pdf