Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification

Lukas Wertz, Jasmina Bogojeska, Katsiaryna Mirylenka, Jonas Kuhn


Abstract
Text classification datasets from specialised or technical domains are in high demand, especially in industrial applications. However, due to the high cost of annotation, such datasets are usually expensive to create. While Active Learning (AL) can reduce the labeling cost, required AL strategies are often only tested on general knowledge domains and tend to use information sources that are not consistent across tasks. We propose Reinforced Active Learning (RAL) to train a Reinforcement Learning policy that utilizes many different aspects of the data and the task in order to select the most informative unlabeled subset dynamically over the course of the AL procedure. We demonstrate the superior performance of the proposed RAL framework compared to strong AL baselines across four intricate multi-class, multi-label text classification datasets taken from specialised domains. In addition, we experiment with a unique data augmentation approach to further reduce the number of samples RAL needs to annotate.
Anthology ID:
2023.findings-acl.697
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10959–10977
URL:
https://aclanthology.org/2023.findings-acl.697
DOI:
10.18653/v1/2023.findings-acl.697
Cite (ACL):
Lukas Wertz, Jasmina Bogojeska, Katsiaryna Mirylenka, and Jonas Kuhn. 2023. Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification. In Findings of the Association for Computational Linguistics: ACL 2023, pages 10959–10977, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification (Wertz et al., Findings 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.findings-acl.697.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-5/2023.findings-acl.697.mp4