@inproceedings{zhong-etal-2021-adapting-language,
title = "Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections",
author = "Zhong, Ruiqi and
Lee, Kristy and
Zhang, Zheng and
Klein, Dan",
editor = "Moens, Marie-Francine and
Huang, Xuanjing and
Specia, Lucia and
Yih, Scott Wen-tau",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
month = nov,
year = "2021",
address = "Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/jlcl-multiple-ingestion/2021.findings-emnlp.244/",
doi = "10.18653/v1/2021.findings-emnlp.244",
pages = "2856--2878",
abstract = "Large pre-trained language models (LMs) such as GPT-3 have acquired a surprising ability to perform zero-shot learning. For example, to classify sentiment without any training examples, we can {\textquotedblleft}prompt{\textquotedblright} the LM with the review and the label description {\textquotedblleft}Does the user like this movie?{\textquotedblright}, and ask whether the next word is {\textquotedblleft}yes{\textquotedblright} or {\textquotedblleft}no{\textquotedblright}. However, the next word prediction training objective is still misaligned with the target zero-shot learning objective. To address this weakness, we propose meta-tuning, which directly optimizes the zero-shot learning objective by fine-tuning pre-trained language models on a collection of datasets. We focus on classification tasks, and construct the meta-dataset by aggregating 43 existing datasets and annotating 441 label descriptions in a question-answering (QA) format. When evaluated on unseen tasks, meta-tuned models outperform a same-sized QA model and the previous SOTA zero-shot learning system based on natural language inference. Additionally, increasing parameter count from 220M to 770M improves AUC-ROC scores by 6.3{\%}, and we forecast that even larger models would perform better. Therefore, measuring zero-shot learning performance on language models out-of-the-box might underestimate their true potential, and community-wide efforts on aggregating datasets and unifying their formats can help build models that answer prompts better."
}