Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models
Hongbang Yuan, Yubo Chen, Pengfei Cao, Zhuoran Jin, Kang Liu
Abstract
Large language models (LLMs) have achieved remarkable success but still tend to generate factually erroneous responses, a phenomenon known as hallucination. A recent trend is to use preference learning to fine-tune models to align with factuality. However, existing work primarily evaluates fine-tuned models on in-domain (ID) datasets, and factuality on out-of-domain (OOD) datasets remains underexplored. In this paper, we conduct a comprehensive evaluation of the factuality of different models tuned by various preference learning algorithms and demonstrate that their performance on OOD datasets either increases minimally or decreases. Subsequently, by analyzing the token distribution shift of the models before and after tuning, we reveal that the main cause of a model’s failure to uphold factuality under a distribution shift is under-alignment rather than over-alignment. Finally, we propose APEFT (Atomic Preference Enhanced Factuality Tuning), a framework that enhances a model’s awareness of factuality at the granularity of individual facts. Extensive experiments demonstrate that APEFT improves model performance by an average of on both ID and OOD datasets, which is highly effective.
- Anthology ID:
- 2025.findings-naacl.354
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2025
- Month:
- April
- Year:
- 2025
- Address:
- Albuquerque, New Mexico
- Editors:
- Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6310–6323
- URL:
- https://preview.aclanthology.org/Author-page-Marten-During-lu/2025.findings-naacl.354/
- Cite (ACL):
- Hongbang Yuan, Yubo Chen, Pengfei Cao, Zhuoran Jin, and Kang Liu. 2025. Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 6310–6323, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models (Yuan et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/Author-page-Marten-During-lu/2025.findings-naacl.354.pdf