RankMean: Module-Level Importance Score for Merging Fine-tuned LLM Models
Gabriel Perin, Xuxi Chen, Shusen Liu, Bhavya Kailkhura, Zhangyang Wang, Brian Gallagher
Abstract
Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs using a wide collection of datasets, a process that often incurs significant computational expenses. Model merging emerges as a cost-effective alternative, allowing the integration of existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at https://github.com/VITA-Group/RankMean.
- Anthology ID:
- 2024.findings-acl.104
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1776–1782
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-acl.104/
- DOI:
- 10.18653/v1/2024.findings-acl.104
- Cite (ACL):
- Gabriel Perin, Xuxi Chen, Shusen Liu, Bhavya Kailkhura, Zhangyang Wang, and Brian Gallagher. 2024. RankMean: Module-Level Importance Score for Merging Fine-tuned LLM Models. In Findings of the Association for Computational Linguistics: ACL 2024, pages 1776–1782, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- RankMean: Module-Level Importance Score for Merging Fine-tuned LLM Models (Perin et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-acl.104.pdf
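The abstract describes RankMean only at a high level: merging coefficients come from the relative rankings of weight change magnitudes and are applied module by module. The sketch below is one illustrative reading of that description, not the authors' implementation (see the linked repository for that). The function names `rank_mean_coefficients` and `merge_models`, the joint ranking across models, and the sum-to-one normalization are all assumptions made for the example.

```python
import numpy as np


def rank_mean_coefficients(deltas):
    """Hypothetical per-module coefficients from mean ranks of |weight change|.

    deltas: list of K arrays with identical shape, one per fine-tuned model,
    each holding (fine-tuned weight - base weight) for a single module.
    """
    # Magnitudes of every weight change, one row per model.
    mags = np.stack([np.abs(d).ravel() for d in deltas])  # shape (K, n)
    flat = mags.ravel()

    # Rank all entries jointly: 1 = smallest change, K*n = largest change.
    # Ties are broken by position (stable sort); this is an assumption.
    ranks = np.empty(flat.size, dtype=np.float64)
    ranks[np.argsort(flat, kind="stable")] = np.arange(1, flat.size + 1)

    # Module-level score per model = mean rank of its entries.
    scores = ranks.reshape(mags.shape).mean(axis=1)

    # Normalize so the coefficients for this module sum to 1 (assumption).
    return scores / scores.sum()


def merge_models(base, finetuned):
    """Merge K fine-tuned models into one, module by module.

    base: dict mapping module name -> base weight array.
    finetuned: list of dicts with the same keys and shapes as `base`.
    """
    merged = {}
    for name, w0 in base.items():
        deltas = [ft[name] - w0 for ft in finetuned]
        coefs = rank_mean_coefficients(deltas)
        merged[name] = w0 + sum(c * d for c, d in zip(coefs, deltas))
    return merged


# Toy example: one module, two fine-tuned models.
base = {"w": np.zeros(2)}
model_a = {"w": np.array([1.0, 1.0])}
model_b = {"w": np.array([3.0, 3.0])}
merged = merge_models(base, [model_a, model_b])
```

In the toy example, the model with the larger weight changes receives the larger coefficient for module `w`. No downstream data is needed, which matches the data-free setting the abstract describes.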