Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment

Xiaotian Zhang, Ruizhe Chen, Yang Feng, Zuozhu Liu


Abstract
Aligning language models with human preferences presents significant challenges, particularly in achieving personalization without incurring excessive computational costs. Existing methods rely on reward signals and additional annotated data, limiting their scalability and adaptability to diverse human values. To address these challenges, we introduce Persona-judge, a novel discriminative paradigm that enables training-free personalized alignment with unseen preferences. Instead of optimizing policy parameters through external reward feedback, Persona-judge leverages the intrinsic preference judgment capabilities of the model. Specifically, a draft model generates candidate tokens conditioned on a given preference, while a judge model, embodying another preference, cross-validates the predicted tokens and decides whether each should be accepted. Experimental results demonstrate that Persona-judge, using the inherent preference evaluation mechanisms of the model, offers a scalable and computationally efficient solution to personalized alignment, paving the way for more adaptive customized alignment. Our code is available here.
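The draft-then-judge step described in the abstract can be illustrated with a toy sketch. The abstract does not state the acceptance rule, so the rule below (accept a drafted token with probability min(1, p_judge / p_draft), as in speculative sampling) is an assumption, not the authors' method. The function names `accept_probability` and `draft_then_judge` and the toy next-token distributions are hypothetical.

```python
import random

def accept_probability(token, draft_probs, judge_probs):
    """Probability that the judge accepts a drafted token.

    ASSUMPTION: speculative-sampling-style rule min(1, p_judge / p_draft).
    The abstract does not specify the actual acceptance criterion.
    """
    p_draft = draft_probs.get(token, 0.0)
    if p_draft <= 0.0:
        return 0.0  # the draft model could never have proposed this token
    p_judge = judge_probs.get(token, 0.0)
    return min(1.0, p_judge / p_draft)

def draft_then_judge(draft_probs, judge_probs, rng):
    """Sample one candidate token from the draft, then let the judge accept or reject it."""
    tokens = list(draft_probs)
    weights = [draft_probs[t] for t in tokens]
    candidate = rng.choices(tokens, weights=weights, k=1)[0]
    accepted = rng.random() < accept_probability(candidate, draft_probs, judge_probs)
    return candidate, accepted

# Hypothetical next-token distributions: the draft is conditioned on one
# preference, the judge on another.
draft_probs = {"concise": 0.6, "detailed": 0.3, "rude": 0.1}
judge_probs = {"concise": 0.7, "detailed": 0.3, "rude": 0.0}

rng = random.Random(0)
for _ in range(5):
    token, accepted = draft_then_judge(draft_probs, judge_probs, rng)
    print(token, "accepted" if accepted else "rejected")
```

In this toy setup, a token that the judge's preference rules out entirely ("rude") is never accepted, while tokens the judge rates at least as likely as the draft does are always accepted.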
Anthology ID:
2025.findings-acl.260
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
5037–5049
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.260/
Cite (ACL):
Xiaotian Zhang, Ruizhe Chen, Yang Feng, and Zuozhu Liu. 2025. Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment. In Findings of the Association for Computational Linguistics: ACL 2025, pages 5037–5049, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment (Zhang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.260.pdf