AdaK-NER: An Adaptive Top-K Approach for Named Entity Recognition with Incomplete Annotations

Hongtao Ruan, Liying Zheng, Peixian Hu


Abstract
State-of-the-art Named Entity Recognition (NER) models rely heavily on large amounts of fully annotated training data. However, accessible data are often incompletely annotated, since annotators usually lack comprehensive knowledge of the target domain. Unannotated tokens are normally treated as non-entities by default, but we emphasize that these tokens could be either non-entities or part of any entity. Here, we study NER modeling with incompletely annotated data, where only a fraction of the named entities are labeled and the unlabeled tokens are treated as multi-labeled with every possible label. Once multi-labeled tokens are taken into account, the numerous possible paths can distract the model from the gold path (the ground-truth label sequence) during training and thus hinder its learning ability. In this paper, we propose AdaK-NER, an adaptive top-K approach, to help the model focus on a smaller feasible region where the gold path is more likely to be located. We demonstrate the superiority of our approach through extensive experiments on both English and Chinese datasets, improving F-score on average by 2% on CoNLL-2003 and by over 10% on two Chinese datasets compared with prior state-of-the-art work.
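To make the "numerous possible paths" concrete, here is a minimal sketch of how leaving tokens unlabeled multiplies the number of candidate label sequences, and how keeping only the K highest-scoring sequences shrinks that set. It is an illustration under stated assumptions, not the AdaK-NER algorithm: the label set, the per-token scores, and the helper names `candidate_labels`, `count_paths`, and `top_k_paths` are all hypothetical, scores are summed per token with no transition terms, and K is fixed rather than adaptive. The brute-force enumeration is only workable for toy inputs.

```python
import heapq
from itertools import product

# Hypothetical label set, chosen only for illustration.
LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]


def candidate_labels(partial):
    """Expand a partial annotation: None marks an unlabeled token, which may take any label."""
    return [LABELS if tag is None else [tag] for tag in partial]


def count_paths(partial):
    """Number of label sequences consistent with the partial annotation."""
    total = 1
    for options in candidate_labels(partial):
        total *= len(options)
    return total


def top_k_paths(partial, scores, k):
    """Enumerate every consistent path and keep the k with the highest summed token scores."""
    def path_score(path):
        return sum(scores[i][tag] for i, tag in enumerate(path))

    return heapq.nlargest(k, product(*candidate_labels(partial)), key=path_score)


# Toy sentence of four tokens: the first and last are annotated, the middle two are not.
partial = ["B-PER", None, None, "B-LOC"]

# Made-up per-token scores (assumption); a trained model would supply these.
scores = [
    {"O": 1, "B-PER": 20, "I-PER": 0, "B-LOC": 0, "I-LOC": 0},
    {"O": 5, "B-PER": 0, "I-PER": 15, "B-LOC": 1, "I-LOC": 2},
    {"O": 20, "B-PER": 0, "I-PER": 3, "B-LOC": 1, "I-LOC": 2},
    {"O": 0, "B-PER": 0, "I-PER": 0, "B-LOC": 10, "I-LOC": 0},
]

print(count_paths(partial))  # 5 labels for each of 2 unlabeled tokens: 25 paths
for path in top_k_paths(partial, scores, k=3):
    print(path)
```

With two unlabeled tokens and five labels the candidate set already holds 25 paths, and it grows exponentially with the number of unlabeled tokens; restricting attention to a few high-scoring paths is one simple way to picture the "smaller feasible region" the abstract refers to.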
Anthology ID:
2022.finnlp-1.26
Volume:
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Venue:
FinNLP
Publisher:
Association for Computational Linguistics
Pages:
196–202
URL:
https://aclanthology.org/2022.finnlp-1.26
DOI:
10.18653/v1/2022.finnlp-1.26
Cite (ACL):
Hongtao Ruan, Liying Zheng, and Peixian Hu. 2022. AdaK-NER: An Adaptive Top-K Approach for Named Entity Recognition with Incomplete Annotations. In Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP), pages 196–202, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
AdaK-NER: An Adaptive Top-K Approach for Named Entity Recognition with Incomplete Annotations (Ruan et al., FinNLP 2022)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2022.finnlp-1.26.pdf