@inproceedings{vu-etal-2021-strata,
    title = "{ST}ra{TA}: Self-Training with Task Augmentation for Better Few-shot Learning",
    author = "Vu, Tu  and
      Luong, Minh-Thang  and
      Le, Quoc  and
      Simon, Grady  and
      Iyyer, Mohit",
    editor = "Moens, Marie-Francine  and
      Huang, Xuanjing  and
      Specia, Lucia  and
      Yih, Scott Wen-tau",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2021.emnlp-main.462/",
    doi = "10.18653/v1/2021.emnlp-main.462",
    pages = "5715--5731",
    abstract = "Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available. To address this shortcoming, we propose STraTA, which stands for Self-Training with Task Augmentation, an approach that builds on two key ideas for effective leverage of unlabeled data. First, STraTA uses task augmentation, a novel technique that synthesizes a large amount of data for auxiliary-task fine-tuning from target-task unlabeled texts. Second, STraTA performs self-training by further fine-tuning the strong base model created by task augmentation on a broad distribution of pseudo-labeled data. Our experiments demonstrate that STraTA can substantially improve sample efficiency across 12 few-shot benchmarks. Remarkably, on the SST-2 sentiment dataset, STraTA, with only 8 training examples per class, achieves comparable results to standard fine-tuning with 67K training examples. Our analyses reveal that task augmentation and self-training are both complementary and independently effective."
}
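For a concrete picture of the self-training loop the abstract describes, here is a minimal illustrative sketch. The names `fine_tune`, `self_train`, and the confidence threshold are hypothetical placeholders, not the authors' code or API; in the actual method, each step fine-tunes the BERT-style base model produced by task augmentation rather than the toy stand-in shown here.

```python
# Illustrative sketch of self-training on pseudo-labeled data, as outlined in
# the STraTA abstract. `fine_tune` is a toy placeholder, NOT the paper's
# implementation: the real method fine-tunes the task-augmentation base model.

def fine_tune(base_model, examples):
    """Placeholder 'fine-tuning': returns a toy model that always predicts the
    majority label of `examples` with a fixed confidence score."""
    labels = [y for _, y in examples]
    majority = max(set(labels), key=labels.count)
    return lambda x: (majority, 0.9)  # (predicted label, confidence)

def self_train(base_model, labeled, unlabeled, rounds=5, threshold=0.8):
    """Iteratively pseudo-label the unlabeled pool and re-fine-tune the base
    model on labeled + confidently pseudo-labeled examples."""
    model = fine_tune(base_model, labeled)
    for _ in range(rounds):
        # Pseudo-label the unlabeled texts and keep confident predictions.
        pseudo = []
        for x in unlabeled:
            y_hat, conf = model(x)
            if conf >= threshold:
                pseudo.append((x, y_hat))
        # Re-fine-tune from the base model (as the abstract suggests), rather
        # than continuing from the previous round's model.
        model = fine_tune(base_model, labeled + pseudo)
    return model

if __name__ == "__main__":
    labeled = [("great movie", "pos"), ("terrible plot", "neg")] * 4
    unlabeled = ["loved it", "awful acting", "fantastic", "boring"] * 10
    clf = self_train(base_model=None, labeled=labeled, unlabeled=unlabeled)
    print(clf("loved it"))
```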