How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?

Shayne Longpre; Yu Wang; Chris DuBois

doi:10.18653/v1/2020.findings-emnlp.394

Abstract

Task-agnostic forms of data augmentation have proven widely effective in computer vision, even on pretrained models. In NLP similar results are reported most commonly for low data regimes, non-pretrained models, or situationally for pretrained models. In this paper we ask how effective these techniques really are when applied to pretrained transformers. Using two popular varieties of task-agnostic data augmentation (not tailored to any particular task), Easy Data Augmentation (Wei and Zou, 2019) and Back-Translation (Sennrich et al., 2015), we conduct a systematic examination of their effects across 5 classification tasks, 6 datasets, and 3 variants of modern pretrained transformers, including BERT, XLNet, and RoBERTa. We observe a negative result, finding that techniques which previously reported strong improvements for non-pretrained models fail to consistently improve performance for pretrained transformers, even when training data is limited. We hope this empirical analysis helps inform practitioners where data augmentation techniques may confer improvements.

Anthology ID: 2020.findings-emnlp.394
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Editors: Trevor Cohn, Yulan He, Yang Liu
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4401–4411
URL: https://aclanthology.org/2020.findings-emnlp.394
DOI: 10.18653/v1/2020.findings-emnlp.394
Cite (ACL): Shayne Longpre, Yu Wang, and Chris DuBois. 2020. How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4401–4411, Online. Association for Computational Linguistics.
Cite (Informal): How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers? (Longpre et al., Findings 2020)
PDF: https://preview.aclanthology.org/nschneid-patch-2/2020.findings-emnlp.394.pdf
Video: https://slideslive.com/38940806
Data: GLUE, MultiNLI, SST, SST-2