Abstract
Finetuning large pre-trained language models with a task-specific head has advanced the state-of-the-art on many natural language understanding benchmarks. However, models with a task-specific head require a lot of training data, making them susceptible to learning and exploiting dataset-specific superficial cues that do not generalize to other datasets. Prompting has reduced the data requirement by reusing the language model head and formatting the task input to match the pre-training objective. Therefore, it is expected that few-shot prompt-based models do not exploit superficial cues. This paper presents an empirical examination of whether few-shot prompt-based models also exploit superficial cues. Analyzing few-shot prompt-based models on MNLI, SNLI, HANS, and COPA has revealed that prompt-based models also exploit superficial cues. While the models perform well on instances with superficial cues, they often underperform or only marginally outperform random accuracy on instances without superficial cues.
- Anthology ID:
- 2022.acl-long.166
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2333–2352
- URL:
- https://aclanthology.org/2022.acl-long.166
- DOI:
- 10.18653/v1/2022.acl-long.166
- Cite (ACL):
- Pride Kavumba, Ryo Takahashi, and Yusuke Oda. 2022. Are Prompt-based Models Clueless?. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2333–2352, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Are Prompt-based Models Clueless? (Kavumba et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2022.acl-long.166.pdf
- Data
- GLUE, MultiNLI, SNLI, SuperGLUE