@inproceedings{li-etal-2025-efficient-ensemble,
  title     = {Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets},
  author    = {Li, Dongyue and
               Zhang, Ziniu and
               Wang, Lu and
               Zhang, Hongyang R.},
  editor    = {Che, Wanxiang and
               Nabende, Joyce and
               Shutova, Ekaterina and
               Pilehvar, Mohammad Taher},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month     = jul,
  year      = {2025},
  address   = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.acl-long.1231/},
  pages     = {25347--25364},
  isbn      = {979-8-89176-251-0},
  abstract  = {This paper develops an ensemble method for fine-tuning a language model to multiple datasets. Existing methods, such as quantized LoRA (QLoRA), are efficient when adapting to a single dataset. When training on multiple datasets of different tasks, a common setup in practice, it remains unclear how to design an efficient adaptation for fine-tuning language models. We propose to use an ensemble of multiple smaller adapters instead of a single adapter per task. We design an efficient algorithm that partitions $n$ datasets into $m$ groups, where $m$ is typically much smaller than $n$ in practice, and train one adapter for each group before taking a weighted combination to form the ensemble. The algorithm leverages a first-order approximation property of low-rank adaptation to quickly obtain the fine-tuning performances of dataset combinations since methods like LoRA stay close to the base model. Hence, we use the gradients of the base model to estimate its behavior during fine-tuning. Empirically, this approximation holds with less than 1{\%} error on models with up to 34 billion parameters, leading to an estimation of true fine-tuning performances under 5{\%} error while speeding up computation compared to base fine-tuning by 105 times. When applied to fine-tune Llama and GPT models on ten text classification tasks, our approach provides up to 10{\%} higher average test accuracy over QLoRA, with only 9{\%} more FLOPs. On a Llama model with 34 billion parameters, an ensemble of QLoRA increases test accuracy by 3{\%} compared to QLoRA, with only 8{\%} more FLOPs.},
}
Markdown (Informal)
[Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets](https://aclanthology.org/2025.acl-long.1231/) (Li et al., ACL 2025)
ACL