Text Augmentation Using Dataset Reconstruction for Low-Resource Classification

Adir Rahamim, Guy Uziel, Esther Goldbraich, Ateret Anaby Tavor


Abstract
In real-world deployments of text classification models, label scarcity is a common problem, and it becomes even more complex as the number of classes increases. One way to address this problem is to apply text augmentation methods; one of the more prominent uses the text-generation capabilities of language models. In this paper, we propose Text AUgmentation by Dataset Reconstruction (TAU-DR), a novel method of data augmentation for text classification. We conduct experiments on several multi-class datasets, showing that our approach improves on current state-of-the-art techniques for data augmentation.
Anthology ID:
2023.findings-acl.466
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7389–7402
URL:
https://aclanthology.org/2023.findings-acl.466
DOI:
10.18653/v1/2023.findings-acl.466
Cite (ACL):
Adir Rahamim, Guy Uziel, Esther Goldbraich, and Ateret Anaby Tavor. 2023. Text Augmentation Using Dataset Reconstruction for Low-Resource Classification. In Findings of the Association for Computational Linguistics: ACL 2023, pages 7389–7402, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Text Augmentation Using Dataset Reconstruction for Low-Resource Classification (Rahamim et al., Findings 2023)
PDF:
https://preview.aclanthology.org/improve-issue-templates/2023.findings-acl.466.pdf