Abstract
Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available. To address this shortcoming, we propose STraTA, which stands for Self-Training with Task Augmentation, an approach that builds on two key ideas for effectively leveraging unlabeled data. First, STraTA uses task augmentation, a novel technique that synthesizes a large amount of data for auxiliary-task fine-tuning from target-task unlabeled texts. Second, STraTA performs self-training by further fine-tuning the strong base model created by task augmentation on a broad distribution of pseudo-labeled data. Our experiments demonstrate that STraTA can substantially improve sample efficiency across 12 few-shot benchmarks. Remarkably, on the SST-2 sentiment dataset, STraTA, with only 8 training examples per class, achieves comparable results to standard fine-tuning with 67K training examples. Our analyses reveal that task augmentation and self-training are both complementary and independently effective.
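To make the two-stage recipe concrete, the sketch below outlines a generic self-training loop of the kind the abstract describes: pseudo-label the unlabeled target-task texts with the current model, then fine-tune again on the few-shot examples plus the pseudo-labeled set. This is an illustration only, not the authors' implementation; the `fine_tune` hook is a hypothetical placeholder for a fine-tuning run that starts from the task-augmented base model, and the number of rounds is an assumption. The official code is in google-research/google-research.

```python
from typing import Callable, List, Tuple

# A labeled example is a (text, class index) pair.
Example = Tuple[str, int]

def self_train(
    fine_tune: Callable[[List[Example]], Callable[[str], int]],
    labeled: List[Example],
    unlabeled: List[str],
    rounds: int = 3,
) -> Callable[[str], int]:
    """Generic self-training loop in the spirit of STraTA's second stage.

    `fine_tune` is a hypothetical hook (not the authors' API): it should
    fine-tune the task-augmented base model on the given examples and
    return a text -> predicted-label function.
    """
    # Round 0: fine-tune on the few-shot labeled examples only.
    model = fine_tune(labeled)
    for _ in range(rounds):
        # Pseudo-label the broad distribution of unlabeled target-task texts.
        pseudo = [(text, model(text)) for text in unlabeled]
        # Fine-tune again on real labels plus pseudo-labels; each call starts
        # from the strong base model, so errors do not compound across rounds.
        model = fine_tune(labeled + pseudo)
    return model
```

In practice, `fine_tune` might wrap a BERT fine-tuning run where the base checkpoint has already been fine-tuned on auxiliary NLI-style data synthesized from the target task's unlabeled texts (the task-augmentation stage).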
- Anthology ID: 2021.emnlp-main.462
- Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2021
- Address: Online and Punta Cana, Dominican Republic
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 5715–5731
- URL: https://aclanthology.org/2021.emnlp-main.462
- DOI: 10.18653/v1/2021.emnlp-main.462
- Cite (ACL): Tu Vu, Minh-Thang Luong, Quoc Le, Grady Simon, and Mohit Iyyer. 2021. STraTA: Self-Training with Task Augmentation for Better Few-shot Learning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5715–5731, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal): STraTA: Self-Training with Task Augmentation for Better Few-shot Learning (Vu et al., EMNLP 2021)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/2021.emnlp-main.462.pdf
- Code: google-research/google-research
- Data: GLUE, MRPC, MultiNLI, QNLI, SNLI, SST