Parameter-efficient Tuning for Large Language Model without Calculating Its Gradients

Feihu Jin, Jiajun Zhang, Chengqing Zong


Abstract
Fine-tuning all parameters of large language models (LLMs) requires significant computational resources and is time-consuming. Recent parameter-efficient tuning methods such as Adapter tuning, Prefix tuning, and LoRA allow updating only a small subset of parameters in large language models. However, they save only about 30% of the training memory because gradient computation and backpropagation are still required. This paper proposes a novel parameter-efficient tuning method for LLMs that does not require calculating their gradients. Leveraging the discernible similarities between the parameter-efficient modules learned for the same task by both large and small language models, we put forward a strategy for transferring parameter-efficient modules derived from small language models to much larger ones. To ensure a smooth and effective adaptation process, we further introduce a Bridge model that guarantees dimensional consistency while also enabling interaction between the models. We demonstrate the effectiveness of our method using the T5 and GPT-2 series of language models on the SuperGLUE benchmark. Our method achieves performance comparable to both fine-tuning and parameter-efficient tuning on large language models without gradient-based optimization. Additionally, our method achieves up to a 5.7x memory reduction compared to parameter-efficient tuning.
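To make the core idea concrete, below is a minimal sketch (not the authors' released code) of how a parameter-efficient module trained with a small language model might be reused inside a much larger model by routing the large model's hidden states through a Bridge that maps between the two hidden dimensions. All class names, dimensions, and design choices here are illustrative assumptions, not the paper's exact architecture.

```python
# Illustrative sketch only: a small-model adapter is kept frozen and plugged into a
# larger model via learned down/up "Bridge" projections for dimensional consistency.
import torch
import torch.nn as nn

SMALL_DIM = 768    # hidden size of the small language model (assumed)
LARGE_DIM = 4096   # hidden size of the large language model (assumed)


class Adapter(nn.Module):
    """Bottleneck adapter assumed to have been trained alongside the small model."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(torch.relu(self.down(h)))


class BridgedAdapter(nn.Module):
    """Wraps a frozen small-model adapter so it can operate on large-model
    hidden states: project down, apply the adapter, project back up."""
    def __init__(self, adapter: Adapter, large_dim: int, small_dim: int):
        super().__init__()
        self.down_bridge = nn.Linear(large_dim, small_dim)
        self.adapter = adapter.eval()              # frozen module from the small model
        for p in self.adapter.parameters():
            p.requires_grad_(False)
        self.up_bridge = nn.Linear(small_dim, large_dim)

    def forward(self, h_large: torch.Tensor) -> torch.Tensor:
        h_small = self.down_bridge(h_large)
        h_small = self.adapter(h_small)
        return h_large + self.up_bridge(h_small)


# Usage: h_large stands in for a hidden state from one layer of the large model.
adapter = Adapter(SMALL_DIM)                       # assume tuned on the small model
bridged = BridgedAdapter(adapter, LARGE_DIM, SMALL_DIM)
h_large = torch.randn(2, 16, LARGE_DIM)            # (batch, seq_len, hidden)
print(bridged(h_large).shape)                      # torch.Size([2, 16, 4096])
```

In this sketch only the Bridge projections would need to be adapted for the large model; the large model itself and the transferred module stay frozen, which is consistent with the paper's goal of avoiding gradient computation through the large model, though the actual training procedure may differ.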
Anthology ID:
2023.emnlp-main.22
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
321–330
URL:
https://aclanthology.org/2023.emnlp-main.22
DOI:
10.18653/v1/2023.emnlp-main.22
Cite (ACL):
Feihu Jin, Jiajun Zhang, and Chengqing Zong. 2023. Parameter-efficient Tuning for Large Language Model without Calculating Its Gradients. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 321–330, Singapore. Association for Computational Linguistics.
Cite (Informal):
Parameter-efficient Tuning for Large Language Model without Calculating Its Gradients (Jin et al., EMNLP 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.emnlp-main.22.pdf
Video:
https://preview.aclanthology.org/emnlp-22-attachments/2023.emnlp-main.22.mp4