PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment

Zekun Moore Wang; Shenzhi Wang; King Zhu; Jiaheng Liu; Ke Xu; Jie Fu; Wangchunshu Zhou; Wenhao Huang

Abstract

Alignment of large language models (LLMs) involves training models on preference-contrastive output pairs to adjust their responses according to human preferences. To obtain such contrastive pairs, traditional methods like RLHF and RLAIF rely on limited contrasting patterns, such as varying model variants or decoding temperatures. This singularity leads to two issues: (1) alignment is not comprehensive; and thereby (2) models are susceptible to harmful response tendencies. To address these issues, we investigate how to construct more comprehensive and diversified contrasting patterns to enhance preference data (RQ1) and verify the impact of the diversification of contrasting patterns on model alignment (RQ2). For RQ1, we propose PopAlign, a framework that integrates diversified contrasting patterns across the prompt, model, and pipeline levels, introducing six contrasting strategies that do not require additional feedback labeling procedures. Regarding RQ2, we conduct thorough experiments demonstrating that PopAlign significantly outperforms existing methods, leading to more comprehensive alignment.

Anthology ID: 2025.acl-long.1403
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 28893–28921
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1403/
Cite (ACL): Zekun Moore Wang, Shenzhi Wang, King Zhu, Jiaheng Liu, Ke Xu, Jie Fu, Wangchunshu Zhou, and Wenhao Huang. 2025. PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 28893–28921, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment (Wang et al., ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1403.pdf
