ATLANTIS: Weak-to-Strong Learning via Importance Sampling

Yi Liu; Guoyin Wang; Shicheng Li; Feifan Song; Xu Sun

Abstract

Supervised fine-tuning (SFT) aligns large language models with their training data and improves performance in many respects. However, the gap between the distribution of current datasets, whether human-annotated or model-generated, and the real-world data distribution heavily limits models’ capacity and potential. We therefore propose a new SFT technique, ATLANTIS, to bridge this gap. Because the optimal real-world data distribution is hard to sample from, we use importance sampling to estimate it from existing training datasets. We further introduce an extra small model and a reference model, and estimate the sampling ratio from the probability gap between them. We evaluate our method on benchmarks covering knowledge & understanding and preference. The experimental results show that ATLANTIS brings consistent and significant improvements to model performance. Moreover, our method transfers flexibly across models with different architectures. Our analyses demonstrate that it is compatible with other SFT techniques, which further enhances model capacity, and that it has great potential to be combined with existing training frameworks.
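The importance-sampling idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything concrete here is an assumption made for illustration: the function names, the use of sequence-level log-probabilities, the direction of the ratio (small model over reference model), the clipping constant, and the self-normalization step. It is not the authors' implementation.

```python
import math

def importance_weights(logp_small, logp_ref, clip=10.0):
    """Turn per-example log-probability gaps into self-normalized weights.

    logp_small / logp_ref: sequence log-probabilities of each training
    example under a small model and a reference model (hypothetical inputs).
    clip: assumed bound on the log-ratio to keep the weights stable.
    """
    # Log of the estimated sampling ratio, clipped to [-clip, clip].
    log_ratios = [max(-clip, min(clip, s - r))
                  for s, r in zip(logp_small, logp_ref)]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_ratios)
    unnorm = [math.exp(lr - m) for lr in log_ratios]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def weighted_nll(nll_per_example, weights):
    """Importance-weighted SFT loss: weighted sum of per-example NLLs."""
    return sum(w * l for w, l in zip(weights, nll_per_example))

# Toy numbers, invented purely for illustration.
logp_small = [-10.0, -12.0, -9.0]
logp_ref = [-11.0, -11.0, -11.0]
weights = importance_weights(logp_small, logp_ref)
loss = weighted_nll([2.0, 3.0, 1.5], weights)
```

In this toy setup, examples that the small model finds more likely than the reference model receive larger weights, so they contribute more to the fine-tuning loss of the model being trained.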

Anthology ID: 2025.acl-long.52
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1042–1052
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.52/
Cite (ACL): Yi Liu, Guoyin Wang, Shicheng Li, Feifan Song, and Xu Sun. 2025. ATLANTIS: Weak-to-Strong Learning via Importance Sampling. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1042–1052, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): ATLANTIS: Weak-to-Strong Learning via Importance Sampling (Liu et al., ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.52.pdf
