When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario
Chengcheng Han, Liqing Cui, Renyu Zhu, Jianing Wang, Nuo Chen, Qiushi Sun, Xiang Li, Ming Gao
Abstract
Large pre-trained language models (PLMs) have garnered significant attention for their versatility and potential for solving a wide spectrum of natural language processing (NLP) tasks. However, the cost of running these PLMs may be prohibitive. Furthermore, PLMs such as GPT-3 may not be open-sourced due to commercial considerations and potential risks of misuse. The parameters and gradients of PLMs are unavailable in this scenario. To address this issue, black-box tuning has been proposed, which utilizes derivative-free optimization (DFO), instead of gradient descent, for training task-specific continuous prompts. However, these gradient-free methods still exhibit a significant gap compared to gradient-based methods. In this paper, we introduce gradient descent into the black-box tuning scenario through knowledge distillation. Furthermore, we propose a novel method, GDFO, which integrates gradient descent and derivative-free optimization to optimize task-specific continuous prompts in a harmonized manner. Experimental results show that GDFO achieves significant performance gains over previous state-of-the-art methods.
- Anthology ID: 2023.findings-acl.55
- Volume: Findings of the Association for Computational Linguistics: ACL 2023
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 868–880
- URL: https://aclanthology.org/2023.findings-acl.55
- Cite (ACL): Chengcheng Han, Liqing Cui, Renyu Zhu, Jianing Wang, Nuo Chen, Qiushi Sun, Xiang Li, and Ming Gao. 2023. When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario. In Findings of the Association for Computational Linguistics: ACL 2023, pages 868–880, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario (Han et al., Findings 2023)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/2023.findings-acl.55.pdf
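For readers skimming this entry, the sketch below illustrates the general setup the abstract describes: a continuous prompt is tuned against a black-box loss using only function evaluations (derivative-free optimization), while a locally trained surrogate, fit to the observed prompt–loss pairs in a distillation-like fashion, contributes a gradient-descent update. All names, the toy "black-box" scorer, and the surrogate are hypothetical illustrations of the idea, not the authors' GDFO implementation.

```python
# Illustrative sketch (not the paper's implementation): combine a derivative-free
# update on a black-box loss with a gradient-descent update through a locally
# trained surrogate that mimics the black-box behavior.
import numpy as np

rng = np.random.default_rng(0)
DIM = 16      # dimensionality of the continuous prompt
SIGMA = 0.3   # perturbation scale for the derivative-free search
LR = 0.05     # learning rate for the gradient step on the surrogate

# Toy "black-box" scorer: returns a task loss for a prompt; we never use its gradients.
W_TRUE = rng.normal(size=DIM)
def black_box_loss(prompt):
    return float(np.mean((prompt - W_TRUE) ** 2))

class Surrogate:
    """Local model fit to (prompt, loss) pairs; supplies a differentiable signal."""
    def __init__(self, dim):
        self.center = np.zeros(dim)  # learned estimate of the low-loss region
    def fit_step(self, prompts, losses, lr=0.1):
        # Distillation-style update: pull the center toward low-loss prompts.
        weights = np.exp(-np.asarray(losses))
        weights /= weights.sum()
        target = (weights[:, None] * np.asarray(prompts)).sum(axis=0)
        self.center += lr * (target - self.center)
    def grad(self, prompt):
        # Gradient of the surrogate loss ||prompt - center||^2.
        return 2.0 * (prompt - self.center)

prompt = np.zeros(DIM)
surrogate = Surrogate(DIM)
seen_prompts, seen_losses = [], []

for step in range(200):
    # (1) Derivative-free update: accept a random perturbation if it lowers the black-box loss.
    candidate = prompt + SIGMA * rng.normal(size=DIM)
    if black_box_loss(candidate) < black_box_loss(prompt):
        prompt = candidate
    # (2) Fit the surrogate on recently observed (prompt, loss) pairs.
    seen_prompts.append(prompt.copy())
    seen_losses.append(black_box_loss(prompt))
    surrogate.fit_step(seen_prompts[-16:], seen_losses[-16:])
    # (3) Gradient-descent update through the surrogate, blended with the DFO result.
    prompt = prompt - LR * surrogate.grad(prompt)

print("final black-box loss:", black_box_loss(prompt))
```

The design point the sketch makes is the one stated in the abstract: the black-box loss is queried only for function values, while gradient descent enters indirectly through a model trained on those queries, and the two update signals act on the same continuous prompt.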