Abstract
When fine-tuning pretrained models for classification, researchers either use a generic model head or a task-specific prompt for prediction. Proponents of prompting have argued that prompts provide a method for injecting task-specific guidance, which is beneficial in low-data regimes. We aim to quantify this benefit through rigorous testing of prompts in a fair setting: comparing prompted and head-based fine-tuning in equal conditions across many tasks and data sizes. By controlling for many sources of advantage, we find that prompting does indeed provide a benefit, and that this benefit can be quantified per task. Results show that prompting is often worth 100s of data points on average across classification tasks.

- Anthology ID: 2021.naacl-main.208
- Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month: June
- Year: 2021
- Address: Online
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 2627–2636
- URL: https://aclanthology.org/2021.naacl-main.208
- DOI: 10.18653/v1/2021.naacl-main.208
- Award: Outstanding Short Paper
- Cite (ACL): Teven Le Scao and Alexander Rush. 2021. How many data points is a prompt worth?. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2627–2636, Online. Association for Computational Linguistics.
- Cite (Informal): How many data points is a prompt worth? (Le Scao & Rush, NAACL 2021)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2021.naacl-main.208.pdf
- Code: TevenLeScao/pet
- Data: BoolQ, COPA, MultiNLI, MultiRC, SuperGLUE, WSC