UseClean: learning from complex noisy labels in named entity recognition
Jinjin Tian, Kun Zhou, Meiguo Wang, Yu Zhang, Benjamin Yao, Xiaohu Liu, Chenlei Guo
Abstract
We investigate and refine denoising methods for the NER task on data that may contain extremely noisy labels from multiple sources. We first summarize all possible noise types and noise generation schemes, and build a thorough evaluation system on that basis. We then use this evaluation system to pinpoint the bottleneck of current state-of-the-art denoising methods. In response, we propose several refinements: a two-stage framework to avoid error accumulation; a novel confidence score that uses minimal clean supervision to increase predictive power; automatic cutoff fitting to avoid extensive hyper-parameter tuning; and a warm-started weighted partial CRF to learn better on the noisy tokens. Additionally, we propose adaptive sampling to further boost performance in long-tailed entity settings. Across extensive experiments, our method improves the F1 score by at least 5 to 10% on average over the current state of the art.
- Anthology ID:
- 2023.clasp-1.14
- Volume:
- Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)
- Month:
- September
- Year:
- 2023
- Address:
- Gothenburg, Sweden
- Editors:
- Ellen Breitholtz, Shalom Lappin, Sharid Loaiciga, Nikolai Ilinykh, Simon Dobnik
- Venue:
- CLASP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 120–130
- URL:
- https://aclanthology.org/2023.clasp-1.14
- Cite (ACL):
- Jinjin Tian, Kun Zhou, Meiguo Wang, Yu Zhang, Benjamin Yao, Xiaohu Liu, and Chenlei Guo. 2023. UseClean: learning from complex noisy labels in named entity recognition. In Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), pages 120–130, Gothenburg, Sweden. Association for Computational Linguistics.
- Cite (Informal):
- UseClean: learning from complex noisy labels in named entity recognition (Tian et al., CLASP 2023)
- PDF:
- https://preview.aclanthology.org/teach-a-man-to-fish/2023.clasp-1.14.pdf