BBTv2: Towards a Gradient-Free Future with Large Language Models

Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu


Abstract
Most downstream adaptation methods tune all or part of the parameters of pre-trained models (PTMs) through gradient descent, where the tuning cost increases linearly with the growth of the model size. By contrast, gradient-free methods only require forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment. However, past work on gradient-free tuning often introduces gradient descent to seek a good initialization of the prompt and lacks versatility across tasks and PTMs. In this paper, we present BBTv2, an improved version of Black-Box Tuning, to drive PTMs for few-shot learning. We prepend continuous prompts to every layer of the PTM and propose a divide-and-conquer gradient-free algorithm to optimize the prompts at different layers alternately. Extensive experiments across various tasks and PTMs show that BBTv2 can achieve performance comparable to full model tuning and state-of-the-art parameter-efficient methods (e.g., Adapter, LoRA, BitFit) under few-shot settings while maintaining far fewer tunable parameters.
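Below is a minimal sketch, not the authors' released implementation, of the divide-and-conquer gradient-free idea the abstract describes: each layer of a frozen PTM gets its own continuous prompt parameterized in a low-dimensional subspace through a fixed random projection, and the per-layer subspace vectors are optimized one layer at a time using only forward passes. The choice of CMA-ES (via the `cma` package) as the derivative-free optimizer is an assumption for illustration, and the loss function, model, and all dimensions are placeholders.

import numpy as np
import cma  # pip install cma; derivative-free CMA-ES optimizer (assumed choice)

# Illustrative sizes only; real prompt and model dimensions differ.
NUM_LAYERS = 4        # number of PTM layers receiving a prompt
SUBSPACE_DIM = 100    # low-dimensional search space per layer
PROMPT_DIM = 5 * 64   # prompt length x hidden size of a hypothetical PTM

rng = np.random.default_rng(0)
# One fixed random projection per layer maps the low-dimensional vector
# being searched to a full continuous prompt for that layer.
projections = [rng.normal(0.0, 1e-2, (PROMPT_DIM, SUBSPACE_DIM))
               for _ in range(NUM_LAYERS)]
z = [np.zeros(SUBSPACE_DIM) for _ in range(NUM_LAYERS)]  # current latent prompts

def few_shot_loss(prompts):
    # Placeholder: in practice this would prepend `prompts` to the hidden
    # states of each layer, run a forward pass of the frozen PTM on the
    # few-shot training set, and return the task loss.
    return sum(float(np.linalg.norm(p)) for p in prompts)

# Divide-and-conquer: cycle over layers, optimizing one layer's latent
# prompt at a time with CMA-ES while the other layers stay fixed.
for _round in range(2):
    for layer in range(NUM_LAYERS):
        es = cma.CMAEvolutionStrategy(z[layer], 0.5,
                                      {"popsize": 10, "maxiter": 5, "verbose": -9})
        while not es.stop():
            candidates = es.ask()
            losses = []
            for cand in candidates:
                prompts = [projections[l] @ (cand if l == layer else z[l])
                           for l in range(NUM_LAYERS)]
                losses.append(few_shot_loss(prompts))
            es.tell(candidates, losses)
        z[layer] = es.result.xbest  # keep the best latent prompt found so far

final_prompts = [projections[l] @ z[l] for l in range(NUM_LAYERS)]

Because only the low-dimensional latent vectors are searched, the number of tunable values per layer stays small regardless of model size, which is what keeps the optimization tractable with forward passes alone.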
Anthology ID:
2022.emnlp-main.259
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3916–3930
URL:
https://aclanthology.org/2022.emnlp-main.259
DOI:
10.18653/v1/2022.emnlp-main.259
Cite (ACL):
Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, and Xipeng Qiu. 2022. BBTv2: Towards a Gradient-Free Future with Large Language Models. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3916–3930, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
BBTv2: Towards a Gradient-Free Future with Large Language Models (Sun et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/add_acl24_videos/2022.emnlp-main.259.pdf