Abstract
Text augmentation is an effective technique for alleviating overfitting in NLP tasks. In existing methods, text augmentation and the downstream task are mostly performed separately, so the augmented texts may not be optimal for training the downstream model. To address this problem, we propose a three-level optimization framework that performs text augmentation and the downstream task end-to-end, so that the augmentation model is trained in a way tailored to the downstream task. Our framework consists of three learning stages. In the first stage, a text summarization model is trained to perform data augmentation; each summarization example is associated with a weight that accounts for its domain difference from the text classification data. In the second stage, we use the summarization model trained in the first stage to perform text augmentation and train a text classification model on the augmented texts. In the third stage, we evaluate the text classification model trained in the second stage and update the weights of the summarization examples by minimizing the validation loss. These three stages are performed end-to-end. We evaluate our method on several text classification datasets, where the results demonstrate its effectiveness. Code is available at https://github.com/Sai-Ashish/End-to-End-Text-Augmentation.
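The three stages described above can be written as a nested (three-level) optimization problem. The notation below is a hedged sketch, not taken verbatim from the paper: W denotes the summarization model's parameters, V the classifier's parameters, a_i the weight on summarization example i, D_tr and D_val the classification training and validation sets, and Aug(·; W) the augmentation produced by the trained summarizer.

\[
\begin{aligned}
W^{*}(a) &= \operatorname*{argmin}_{W}\ \sum_{i} a_{i}\,\mathcal{L}_{\mathrm{sum}}\!\left(W;\, x_{i}\right)
&& \text{(Stage 1: weighted summarization training)}\\[2pt]
V^{*}(a) &= \operatorname*{argmin}_{V}\ \mathcal{L}_{\mathrm{cls}}\!\left(V;\, D_{\mathrm{tr}} \cup \mathrm{Aug}(D_{\mathrm{tr}};\, W^{*}(a))\right)
&& \text{(Stage 2: classifier trained on augmented texts)}\\[2pt]
\min_{a}\ &\ \mathcal{L}_{\mathrm{val}}\!\left(V^{*}(a);\, D_{\mathrm{val}}\right)
&& \text{(Stage 3: example weights updated on validation loss)}
\end{aligned}
\]

Nested problems of this form are typically solved by alternating gradient updates, approximating each inner argmin with one or a few descent steps, as in gradient-based hyperparameter optimization; the repository linked above contains the authors' actual implementation.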
- Anthology ID: 2022.tacl-1.20
- Volume: Transactions of the Association for Computational Linguistics, Volume 10
- Year: 2022
- Address: Cambridge, MA
- Venue: TACL
- Publisher: MIT Press
- Pages: 343–358
- URL: https://aclanthology.org/2022.tacl-1.20
- DOI: 10.1162/tacl_a_00464
- Cite (ACL): Sai Ashish Somayajula, Linfeng Song, and Pengtao Xie. 2022. A Multi-Level Optimization Framework for End-to-End Text Augmentation. Transactions of the Association for Computational Linguistics, 10:343–358.
- Cite (Informal): A Multi-Level Optimization Framework for End-to-End Text Augmentation (Somayajula et al., TACL 2022)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2022.tacl-1.20.pdf