Better Modeling of Incomplete Annotations for Named Entity Recognition

Zhanming Jie, Pengjun Xie, Wei Lu, Ruixue Ding, Linlin Li


Abstract
Supervised approaches to named entity recognition (NER) are largely developed based on the assumption that the training data is fully annotated with named entity information. In practice, however, annotated data is often imperfect, and one typical issue is that the training data may contain incomplete annotations. We highlight several pitfalls associated with learning under such a setup in the context of NER and identify limitations of existing approaches. We then propose a novel yet easy-to-implement approach for recognizing named entities with incomplete data annotations, and we demonstrate its effectiveness through extensive experiments.
Anthology ID:
N19-1079
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
729–734
URL:
https://aclanthology.org/N19-1079
DOI:
10.18653/v1/N19-1079
Cite (ACL):
Zhanming Jie, Pengjun Xie, Wei Lu, Ruixue Ding, and Linlin Li. 2019. Better Modeling of Incomplete Annotations for Named Entity Recognition. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 729–734, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Better Modeling of Incomplete Annotations for Named Entity Recognition (Jie et al., NAACL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/N19-1079.pdf
Supplementary:
 N19-1079.Supplementary.zip
Presentation:
 N19-1079.Presentation.pdf
Video:
 https://vimeo.com/360565437
Data:
CoNLL 2002, CoNLL 2003