Abstract
Training with noisy labelled data is known to be detrimental to model performance, especially for high-capacity neural network models in low-resource domains. Our experiments suggest that standard regularisation strategies, such as weight decay and dropout, are ineffective in the face of noisy labels. We propose a simple noisy label detection method that prevents error propagation from the input layer. The approach is based on the observation that the projection of noisy labels is learned through memorisation at advanced stages of learning, and that the Pearson correlation is sensitive to outliers. Extensive experiments over real-world human-disagreement annotations as well as randomly-corrupted and data-augmented labels, across various tasks and domains, demonstrate that our method is effective, regularising noisy labels and improving generalisation performance.

- Anthology ID:
- 2022.coling-1.371
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 4228–4240
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2022.coling-1.371/
- Cite (ACL):
- Yuxia Wang, Timothy Baldwin, and Karin Verspoor. 2022. Noisy Label Regularisation for Textual Regression. In Proceedings of the 29th International Conference on Computational Linguistics, pages 4228–4240, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- Noisy Label Regularisation for Textual Regression (Wang et al., COLING 2022)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2022.coling-1.371.pdf
- Code:
- yuxiaw/regularise-regression-noisy-labels
- Data:
- IMDb Movie Reviews, PeerRead
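
The abstract's observation that the Pearson correlation is sensitive to outliers can be illustrated with a small, self-contained sketch. This is not the authors' detection method, which the abstract does not specify in detail. The scores below, the 0 to 5 scale, and the `pearson` helper are all hypothetical and chosen only for illustration. The sketch shows how corrupting a single label in a tiny regression set can sharply lower the correlation between labels and predictions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gold labels and model predictions on a 0-5 scale.
gold = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
pred = [0.2, 0.9, 2.1, 3.2, 3.9, 5.1]
r_clean = pearson(gold, pred)

# Corrupt one label (assumed noise: the highest score is flipped to the lowest).
noisy_gold = list(gold)
noisy_gold[5] = 0.0
r_noisy = pearson(noisy_gold, pred)

print(f"Pearson r with clean labels:     {r_clean:.3f}")
print(f"Pearson r with one noisy label:  {r_noisy:.3f}")
```

With only six points, one corrupted label is enough to pull the correlation far below its clean value. In larger sets the effect of a single outlier is smaller, so this toy example overstates the magnitude while preserving the direction of the effect.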