Fu Mingsheng


2023

Adaptive Textual Label Noise Learning based on Pre-trained Models
Shaohuan Cheng | Wenyu Chen | Fu Mingsheng | Xuanting Xie | Hong Qu
Findings of the Association for Computational Linguistics: EMNLP 2023

Label noise in real-world scenarios is unpredictable and can even be a mixture of different noise types. To meet this challenge, we develop an adaptive textual label noise learning framework based on pre-trained models, which consists of an adaptive warm-up stage and a hybrid training stage. Specifically, an early stopping method that relies solely on the training set is designed to dynamically terminate the warm-up process according to the model's degree of fit in different noise scenarios. The hybrid training stage incorporates several generalization strategies to gradually correct mislabeled instances, thereby making better use of noisy data. Experiments on multiple datasets demonstrate that our approach performs comparably to or even surpasses state-of-the-art methods in various noise scenarios, including scenarios with a mixture of multiple noise types.