GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning

Kaiyuan Tian; Linbo Qiao; Yu Tang; Gongqingjian Jiang; Baihui Liu; Yifu Gao; Xialin Su; Dongsheng Li

Abstract

Full-parameter fine-tuning of large language models is constrained by substantial GPU memory demands. Low-rank adaptation methods mitigate this challenge by updating only a subset of parameters. However, these approaches often limit model expressiveness and yield lower performance than full-parameter fine-tuning. Layer-wise fine-tuning methods have emerged as an alternative, enabling memory-efficient training through static layer importance sampling strategies. However, these methods overlook variations in layer importance across tasks and training stages, resulting in suboptimal performance on downstream tasks. To address these limitations, we propose GRASS, a gradient-based adaptive layer-wise importance sampling framework. GRASS utilizes mean gradient norms as a task-aware and training-stage-aware metric for estimating layer importance. Furthermore, GRASS adaptively adjusts layer sampling probabilities through an adaptive training strategy. We also introduce a layer-wise optimizer state offloading mechanism to further reduce memory usage while maintaining comparable training throughput. Extensive experiments across multiple models and benchmarks demonstrate that GRASS consistently outperforms state-of-the-art methods, achieving an average accuracy improvement of up to 4.38 points and reducing memory usage by up to 19.97%.
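
As a rough illustration of the idea in the abstract, the Python sketch below scores each layer by the mean of its recent gradient norms, turns the scores into sampling probabilities, and draws a subset of layers to update. The abstract does not give the exact estimator or update rule, so everything here is an assumption for illustration and not the authors' implementation: the function names, the averaging over a window of steps, the proportional normalization, and sampling without replacement.

```python
import math
import random


def mean_gradient_norm(step_grads):
    """Mean L2 norm of one layer's gradients over a window of steps (assumed metric)."""
    norms = [math.sqrt(sum(g * g for g in grad)) for grad in step_grads]
    return sum(norms) / len(norms)


def sampling_probabilities(scores):
    """Normalize importance scores into probabilities (assumed: proportional)."""
    total = sum(scores)
    if total == 0:
        # No gradient signal yet: fall back to uniform sampling.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def sample_layers(probs, k, rng):
    """Draw k distinct layer indices, weighted by probs, without replacement."""
    remaining = list(range(len(probs)))
    # A tiny epsilon keeps the weights valid even if only zero-probability layers remain.
    weights = [p + 1e-12 for p in probs]
    chosen = []
    for _ in range(min(k, len(remaining))):
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return sorted(chosen)


# Hypothetical gradient history: layer index -> list of per-step gradient vectors.
history = {
    0: [[0.1, 0.1], [0.1, 0.1]],
    1: [[3.0, 4.0], [6.0, 8.0]],
    2: [[0.0, 0.0], [0.0, 0.0]],
}
scores = [mean_gradient_norm(history[i]) for i in sorted(history)]
probs = sampling_probabilities(scores)
active = sample_layers(probs, k=2, rng=random.Random(0))
```

In a real training loop, the layers in `active` would be the only ones unfrozen for the next interval, and the scores would be recomputed periodically so that the probabilities track the current task and training stage.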

Anthology ID: 2026.findings-acl.475
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 9777–9790
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.475/
Cite (ACL): Kaiyuan Tian, Linbo Qiao, Yu Tang, Gongqingjian Jiang, Baihui Liu, Yifu Gao, Xialin Su, and Dongsheng Li. 2026. GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9777–9790, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning (Tian et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.475.pdf
Checklist: 2026.findings-acl.475.checklist.pdf
