Self-Guided Alignment: Adaptive Preference Sensing for Multi-Objective Generation

Ning Wang, Zhanyang Liu, Taotao Zhou, Xinrui Zhang, Zongru Shao, Haojie Zhou


Abstract
Aligning Large Language Models (LLMs) with diverse and potentially conflicting human values necessitates navigating complex multi-objective landscapes. However, existing prompt-conditioned approaches face a critical training-inference discrepancy: they rely on ground-truth scores during training while requiring manual user specification at inference. We introduce implicit preference prediction to bridge this gap while reducing user burden. To this end, we propose Self-Guided Alignment (SGA), a framework that transforms passive reward dependency into an intrinsic adaptive sensing capability. It employs a dual-head architecture to unify preference internalization with conditional generation, enabling the model to learn a latent mapping between raw prompts and preference profiles. Through adaptive preference sensing, the model autonomously predicts the latent preference score to self-guide generation, thereby eliminating the need for manual specification at inference. Extensive experiments across diverse model scales demonstrate that SGA often outperforms state-of-the-art baselines, achieving superior multi-objective trade-offs and improved preference alignment. Code is available at https://github.com/python-yyds/SGA.
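The inference flow described in the abstract (predict a preference profile from the raw prompt, then condition generation on it) can be illustrated with a toy sketch. This is not the authors' implementation: the objective names, the linear-plus-sigmoid preference head, the text-tag format, and all function names below are assumptions made purely for illustration, and the generator itself is left out.

```python
import math
from typing import Dict, Sequence

# Hypothetical objective names; the abstract does not list the objectives used.
OBJECTIVES = ["helpfulness", "harmlessness"]


def sigmoid(x: float) -> float:
    """Squash a logit into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_preferences(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Dict[str, float]:
    """Toy preference head (assumed): one linear unit plus sigmoid per objective."""
    scores = {}
    for name, w_row, b in zip(OBJECTIVES, weights, bias):
        logit = sum(w * f for w, f in zip(w_row, features)) + b
        scores[name] = sigmoid(logit)
    return scores


def condition_prompt(prompt: str, scores: Dict[str, float]) -> str:
    """Prepend predicted scores as text tags (assumed format) for a generator."""
    tags = " ".join(f"<{name}: {scores[name]:.2f}>" for name in OBJECTIVES)
    return f"{tags} {prompt}"


def self_guided_prompt(
    prompt: str,
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> str:
    """Predict preferences from prompt features, then build the conditioned prompt."""
    return condition_prompt(prompt, predict_preferences(features, weights, bias))
```

In this sketch the user supplies only the prompt; the preference scores that a prompt-conditioned baseline would ask the user to specify are filled in by the (here untrained) preference head.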
Anthology ID:
2026.acl-long.2184
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
47202–47220
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.2184/
Cite (ACL):
Ning Wang, Zhanyang Liu, Taotao Zhou, Xinrui Zhang, Zongru Shao, and Haojie Zhou. 2026. Self-Guided Alignment: Adaptive Preference Sensing for Multi-Objective Generation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 47202–47220, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Self-Guided Alignment: Adaptive Preference Sensing for Multi-Objective Generation (Wang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.2184.pdf
Checklist:
2026.acl-long.2184.checklist.pdf