Continued Pretraining for Better Zero- and Few-Shot Promptability
Zhaofeng Wu, Robert L Logan IV, Pete Walsh, Akshita Bhagia, Dirk Groeneveld, Sameer Singh, Iz Beltagy
Abstract
Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods still often trail behind full model finetuning. In this work, we investigate if a dedicated continued pretraining stage could improve “promptability”, i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. We reveal settings where existing continued pretraining methods lack promptability. We also identify current methodological gaps, which we fill with thorough large-scale experiments. We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings compared to existing methods, up to 31% relative. On the other hand, we find that continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance. We validate our findings with two prompt tuning methods, and, based on our results, we provide concrete recommendations to optimize promptability for different use cases.
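The recipe highlighted in the abstract, continued pretraining with a trainable prompt during multi-task learning, rests on the prompt-tuning mechanism of prepending trainable soft-prompt embeddings to a (possibly frozen) model's input embeddings. The following is a minimal, hypothetical PyTorch sketch of that mechanism only; the class name, prompt length, and dimensions are illustrative assumptions and not taken from the paper.

```python
# Sketch: a trainable soft prompt prepended to a frozen embedding table.
# Everything here (SoftPromptWrapper, prompt_len=20, vocab/dim sizes) is
# an illustrative assumption, not the paper's implementation.
import torch
import torch.nn as nn


class SoftPromptWrapper(nn.Module):
    def __init__(self, embed: nn.Embedding, prompt_len: int = 20):
        super().__init__()
        self.embed = embed
        # Trainable prompt embeddings; the backbone embeddings can stay frozen.
        self.prompt = nn.Parameter(
            torch.randn(prompt_len, embed.embedding_dim) * 0.02
        )

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)  # (batch, seq_len, dim)
        prompt = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        # Output: (batch, prompt_len + seq_len, dim), fed to the transformer
        # via its inputs_embeds-style interface.
        return torch.cat([prompt, tok], dim=1)


# Usage: freeze the embedding table, train only the prompt parameters.
embed = nn.Embedding(32000, 768)
embed.weight.requires_grad_(False)
wrapper = SoftPromptWrapper(embed, prompt_len=20)
out = wrapper(torch.randint(0, 32000, (4, 16)))
print(out.shape)  # torch.Size([4, 36, 768])
```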
- Anthology ID: 2022.emnlp-main.300
- Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month: December
- Year: 2022
- Address: Abu Dhabi, United Arab Emirates
- Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 4517–4531
- URL: https://aclanthology.org/2022.emnlp-main.300
- DOI: 10.18653/v1/2022.emnlp-main.300
- Cite (ACL): Zhaofeng Wu, Robert L Logan IV, Pete Walsh, Akshita Bhagia, Dirk Groeneveld, Sameer Singh, and Iz Beltagy. 2022. Continued Pretraining for Better Zero- and Few-Shot Promptability. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4517–4531, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal): Continued Pretraining for Better Zero- and Few-Shot Promptability (Wu et al., EMNLP 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/2022.emnlp-main.300.pdf