Model Extrapolation Expedites Alignment

Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng


Abstract
Given the high computational cost of preference alignment training of large language models (LLMs), exploring efficient methods to reduce the training overhead remains an important and compelling research problem. Motivated by the observation that alignment training typically involves only small parameter changes without injecting new knowledge into models, we propose a straightforward method called ExPO (model extrapolation) to expedite LLMs’ alignment with human preferences. Given a partially-trained model and its initial SFT checkpoint, ExPO improves the implicit optimization objective of alignment training by simply amplifying the parameter change based on a first-order approximation, without any additional training overhead. Through controlled experiments, we demonstrate that ExPO boosts a DPO model trained with only 20% of the training steps to outperform the fully-trained one. Moreover, we show that ExPO notably improves existing open-source LLMs (ranging from 1.8B to 70B parameters) on the leading AlpacaEval 2.0 and MT-Bench benchmarks, which highlights ExPO’s broader utility in efficiently enhancing LLM alignment.
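The core operation the abstract describes, amplifying the parameter change from the SFT checkpoint to the partially aligned checkpoint, can be sketched as a simple weight-space extrapolation. The function name `expo`, the plain-dict weight representation, and the default extrapolation strength `alpha` below are illustrative assumptions, not the paper's reference implementation:

```python
def expo(sft_weights, aligned_weights, alpha=0.5):
    """Extrapolate beyond a partially aligned checkpoint (hypothetical sketch).

    sft_weights:     parameters of the initial SFT checkpoint (dict name -> value)
    aligned_weights: parameters of the (partially) alignment-trained checkpoint
    alpha:           extrapolation strength; alpha > 0 amplifies the update
                     theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)
    """
    return {
        name: aligned_weights[name] + alpha * (aligned_weights[name] - sft_weights[name])
        for name in aligned_weights
    }


# Toy usage with scalar "weights": the SFT->aligned change of +1.0 is
# amplified by alpha = 0.5, giving 2.0 + 0.5 * 1.0 = 2.5.
extrapolated = expo({"w": 1.0}, {"w": 2.0}, alpha=0.5)
```

In practice the same arithmetic would be applied per tensor over two model state dicts; since it is a single first-order step in weight space, it adds no training overhead beyond loading the two checkpoints.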
Anthology ID:
2025.acl-long.51
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1025–1041
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.51/
Cite (ACL):
Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, and Nanyun Peng. 2025. Model Extrapolation Expedites Alignment. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1025–1041, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Model Extrapolation Expedites Alignment (Zheng et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.51.pdf