Alignment Data Map for Efficient Preference Data Selection and Diagnosis

Seohyeong Lee; Eunwon Kim; Hwaran Lee; Buru Chang

Abstract

Human preference data is essential for aligning large language models (LLMs) with human values, but collecting such data is often costly and inefficient, motivating the need for efficient data selection methods that reduce annotation costs while preserving alignment effectiveness. To address this issue, we propose Alignment Data Map, a data analysis tool for identifying and selecting effective preference data. We first compute alignment scores for the preference data using LLM-as-a-judge, explicit reward model, and reference-based approaches. The Alignment Data Map then considers both response quality and inter-response variability based on these alignment scores. In our experiments, training on only the 33% of samples that exhibit high quality and low variability achieves comparable or superior alignment performance on MT-Bench, Evol-Instruct, and AlpacaEval compared to training on the full dataset. In addition, Alignment Data Map detects potential label misannotations by analyzing correlations between annotated labels and alignment scores, improving annotation accuracy. The implementation is available at https://github.com/01choco/Alignment-Data-Map.
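The selection and diagnosis steps summarized in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the abstract does not define how quality and variability are computed, so this sketch takes quality as the mean of a pair's two alignment scores and variability as their population variance, combines the two by a simple rank sum, and flags a pair as possibly misannotated when the response labeled as chosen scores lower than the rejected one. The field names (`id`, `chosen_score`, `rejected_score`) and all function names are hypothetical.

```python
from statistics import mean, pvariance


def data_map_coordinates(sample):
    """Return (quality, variability) for one preference pair.

    Assumption: quality = mean score, variability = population variance.
    """
    scores = [sample["chosen_score"], sample["rejected_score"]]
    return mean(scores), pvariance(scores)


def select_subset(samples, fraction=0.33):
    """Keep the fraction of samples with high quality and low variability.

    Assumption: the two criteria are combined by summing their ranks;
    the paper may use a different rule.
    """
    coords = [data_map_coordinates(s) for s in samples]
    n = len(samples)
    rank_sum = [0] * n
    # Rank 0 is best: highest quality, lowest variability.
    for rank, i in enumerate(sorted(range(n), key=lambda i: -coords[i][0])):
        rank_sum[i] += rank
    for rank, i in enumerate(sorted(range(n), key=lambda i: coords[i][1])):
        rank_sum[i] += rank
    keep = max(1, round(fraction * n))
    best = sorted(range(n), key=lambda i: rank_sum[i])[:keep]
    return [samples[i] for i in sorted(best)]  # preserve input order


def flag_possible_misannotations(samples):
    """Flag pairs whose annotated label disagrees with the alignment scores."""
    return [s for s in samples if s["chosen_score"] < s["rejected_score"]]
```

With real data, the alignment scores would come from one of the scoring approaches named in the abstract (LLM-as-a-judge, an explicit reward model, or a reference-based method); here they are simply assumed to be precomputed numbers attached to each pair.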

Anthology ID: 2026.findings-acl.1906
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 38225–38241
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1906/
Cite (ACL): Seohyeong Lee, Eunwon Kim, Hwaran Lee, and Buru Chang. 2026. Alignment Data Map for Efficient Preference Data Selection and Diagnosis. In Findings of the Association for Computational Linguistics: ACL 2026, pages 38225–38241, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Alignment Data Map for Efficient Preference Data Selection and Diagnosis (Lee et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1906.pdf
Checklist: 2026.findings-acl.1906.checklist.pdf
