Bowen Dong
2022
Prompt Tuning for Discriminative Pre-trained Language Models
Yuan Yao | Bowen Dong | Ao Zhang | Zhengyan Zhang | Ruobing Xie | Zhiyuan Liu | Leyu Lin | Maosong Sun | Jianyong Wang
Findings of the Association for Computational Linguistics: ACL 2022
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks. However, to the best of our knowledge, existing works focus on prompt-tuning generative PLMs that are pre-trained to generate target tokens, such as BERT. It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned. In this work, we present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem. Comprehensive experiments on text classification and question answering show that, compared with vanilla fine-tuning, DPT achieves significantly higher performance, and also mitigates the instability problem of tuning large PLMs, in both full-set and low-resource settings.
2020
Meta-Information Guided Meta-Learning for Few-Shot Relation Classification
Bowen Dong | Yuan Yao | Ruobing Xie | Tianyu Gao | Xu Han | Zhiyuan Liu | Fen Lin | Leyu Lin | Maosong Sun
Proceedings of the 28th International Conference on Computational Linguistics
Few-shot classification requires classifiers to adapt to new classes with only a few training instances. State-of-the-art meta-learning approaches such as MAML learn how to initialize and fast adapt parameters from limited instances, which have shown promising results in few-shot classification. However, existing meta-learning models solely rely on implicit instance-based statistics, and thus suffer from instance unreliability and weak interpretability. To solve this problem, we propose a novel meta-information guided meta-learning (MIML) framework, where semantic concepts of classes provide strong guidance for meta-learning in both initialization and adaptation. In effect, our model can establish connections between instance-based information and semantic-based information, which enables more effective initialization and faster adaptation. Comprehensive experimental results on few-shot relation classification demonstrate the effectiveness of the proposed framework. Notably, MIML achieves comparable or superior performance to humans with only one shot on FewRel evaluation.