Prompt Tuning for Discriminative Pre-trained Language Models
Yuan Yao, Bowen Dong, Ao Zhang, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Leyu Lin, Maosong Sun, Jianyong Wang
Abstract
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks. However, to the best of our knowledge, existing works focus on prompt-tuning generative PLMs that are pre-trained to generate target tokens, such as GPT. It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned. In this work, we present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem. Comprehensive experiments on text classification and question answering show that, compared with vanilla fine-tuning, DPT achieves significantly higher performance, and also mitigates the instability of tuning large PLMs in both full-set and low-resource settings.
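The core idea, scoring candidate label words with ELECTRA's replaced-token-detection head instead of generating them, can be sketched as follows. This is a minimal illustration of the discriminative reformulation, not the authors' released implementation (see thunlp/dpt below); the prompt template, verbalizer words, and checkpoint name are assumptions chosen for the example, and DPT itself tunes the model with such prompts rather than only scoring them zero-shot.

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

# Illustrative checkpoint; DPT uses prompt-tuned discriminative PLMs.
tokenizer = ElectraTokenizerFast.from_pretrained("google/electra-base-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-base-discriminator")
model.eval()

VERBALIZER = {"positive": "great", "negative": "terrible"}  # assumed label words
TEMPLATE = "{text} It was {word}."                          # assumed prompt template


def classify(text: str) -> str:
    """Pick the label whose word the discriminator finds most 'original'."""
    replaced_scores = {}
    for label, word in VERBALIZER.items():
        prompt = TEMPLATE.format(text=text, word=word)
        enc = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            # ELECTRA's head emits one logit per token; higher means "replaced".
            token_logits = model(**enc).logits[0]
        word_id = tokenizer.convert_tokens_to_ids(word)
        # The label word sits in the appended template, so take its last occurrence.
        position = (enc["input_ids"][0] == word_id).nonzero()[-1].item()
        replaced_scores[label] = token_logits[position].item()
    # The correct label word should look "original", i.e. get the lowest replaced score.
    return min(replaced_scores, key=replaced_scores.get)


print(classify("A gripping, beautifully shot film."))
```

The released code at thunlp/dpt is the authoritative implementation; the snippet above only demonstrates the discriminative scoring step that replaces token generation.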
- Anthology ID: 2022.findings-acl.273
- Volume: Findings of the Association for Computational Linguistics: ACL 2022
- Month: May
- Year: 2022
- Address: Dublin, Ireland
- Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 3468–3473
- URL: https://aclanthology.org/2022.findings-acl.273
- DOI: 10.18653/v1/2022.findings-acl.273
- Cite (ACL): Yuan Yao, Bowen Dong, Ao Zhang, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Leyu Lin, Maosong Sun, and Jianyong Wang. 2022. Prompt Tuning for Discriminative Pre-trained Language Models. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3468–3473, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal): Prompt Tuning for Discriminative Pre-trained Language Models (Yao et al., Findings 2022)
- PDF: https://preview.aclanthology.org/ingest-2024-clasp/2022.findings-acl.273.pdf
- Code: thunlp/dpt
- Data: AG News, Quoref, SST, SST-2, SST-5