Context-based Virtual Adversarial Training for Text Classification with Noisy Labels

Do-Myoung Lee, Yeachan Kim, Chang gyun Seo


Abstract
Deep neural networks (DNNs) have a high capacity to completely memorize noisy labels given sufficient training time, and this memorization unfortunately leads to performance degradation. Recently, virtual adversarial training (VAT) has attracted attention because it can further improve the generalization of DNNs in semi-supervised learning. The driving force behind VAT is to prevent models from overfitting to data points by enforcing consistency between the predictions on the inputs and on the perturbed inputs. This strategy could be helpful in learning from noisy labels if it prevents neural models from learning noisy samples while encouraging the models to generalize from clean samples. In this paper, we propose context-based virtual adversarial training (ConVAT) to prevent a text classifier from overfitting to noisy labels. Unlike previous work, the proposed method performs adversarial training at the context level rather than the input level. It makes the classifier learn not only the label of each data point but also its contextual neighbors, which alleviates learning from noisy labels by preserving the contextual semantics of each data point. We conduct extensive experiments on four text classification datasets with two types of label noise. The experimental results show that the proposed method performs well even in extremely noisy settings.
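The consistency idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "context" means a fixed-size hidden vector per example, that the classifier is a single linear softmax layer on top of that vector, and that one power-iteration step is enough to approximate the adversarial direction. The names `context_vat_loss`, `epsilon`, and `xi` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def kl(p, q, eps=1e-12):
    # Row-wise KL(p || q) between two batches of distributions.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)


def context_vat_loss(h, W, b, epsilon=1.0, xi=1e-2, seed=0):
    """VAT-style consistency loss on context vectors (illustrative only).

    h: (batch, dim) context vectors (assumed to come from some encoder)
    W: (dim, classes) and b: (classes,) form an assumed linear classifier
    Returns the per-example KL loss and the adversarial perturbation.
    """
    rng = np.random.default_rng(seed)
    p = softmax(h @ W + b)  # clean prediction, treated as a fixed target

    # Start from a random unit direction in context space.
    d = rng.normal(size=h.shape)
    d /= np.linalg.norm(d, axis=-1, keepdims=True) + 1e-12

    # One power-iteration step. For a linear softmax layer, the gradient
    # of KL(p || q) with respect to the perturbation is (q - p) @ W.T.
    q = softmax((h + xi * d) @ W + b)
    grad = (q - p) @ W.T

    # Scale the gradient direction to the perturbation budget epsilon.
    r_adv = epsilon * grad / (np.linalg.norm(grad, axis=-1, keepdims=True) + 1e-12)

    # Consistency loss between clean and perturbed-context predictions.
    q_adv = softmax((h + r_adv) @ W + b)
    return kl(p, q_adv), r_adv
```

In a training loop, a loss of this kind would typically be added to the supervised loss with a weighting coefficient. Note that the consistency term does not use the (possibly noisy) label at all, which is why such a term is a plausible regularizer in the noisy-label setting the abstract describes.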
Anthology ID:
2022.lrec-1.660
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6139–6146
URL:
https://aclanthology.org/2022.lrec-1.660
Cite (ACL):
Do-Myoung Lee, Yeachan Kim, and Chang gyun Seo. 2022. Context-based Virtual Adversarial Training for Text Classification with Noisy Labels. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6139–6146, Marseille, France. European Language Resources Association.
Cite (Informal):
Context-based Virtual Adversarial Training for Text Classification with Noisy Labels (Lee et al., LREC 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.660.pdf