Don’t Forget Your Reward Values: Language Model Alignment via Value-based Calibration
Xin Mao, Feng-Lin Li, Huimin Xu, Wei Zhang, Wang Chen, Anh Tuan Luu
Abstract
While Reinforcement Learning from Human Feedback (RLHF) significantly enhances the generation quality of Large Language Models (LLMs), recent studies have raised concerns regarding the complexity and instability associated with the Proximal Policy Optimization (PPO) algorithm, proposing a series of order-based alignment methods as viable alternatives. This paper delves into existing order-based methods, unifying them into one framework and examining their inefficiencies in utilizing reward values. Building upon these findings, we propose a new Value-based Calibration (VCB) method to better align LLMs with human preferences. Experimental results demonstrate that VCB surpasses existing alignment methods on AI assistant and summarization datasets, providing impressive generalizability, robustness, and diversity in different settings.
- Anthology ID:
- 2024.emnlp-main.976
- Volume:
- Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 17622–17642
- URL:
- https://preview.aclanthology.org/add-emnlp-2024-awards/2024.emnlp-main.976/
- DOI:
- 10.18653/v1/2024.emnlp-main.976
- Cite (ACL):
- Xin Mao, Feng-Lin Li, Huimin Xu, Wei Zhang, Wang Chen, and Anh Tuan Luu. 2024. Don’t Forget Your Reward Values: Language Model Alignment via Value-based Calibration. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 17622–17642, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Don’t Forget Your Reward Values: Language Model Alignment via Value-based Calibration (Mao et al., EMNLP 2024)
- PDF:
- https://preview.aclanthology.org/add-emnlp-2024-awards/2024.emnlp-main.976.pdf