Named Entity Recognition Only from Word Embeddings

Ying Luo, Hai Zhao, Junlang Zhan


Abstract
Deep neural network models have helped named entity (NE) recognition achieve strong performance without handcrafted features. However, existing systems require large amounts of human-annotated training data. Efforts have been made to replace human annotations with external knowledge (e.g., NE dictionaries, part-of-speech tags), but obtaining such resources is a challenge in itself. In this work, we propose a fully unsupervised NE recognition model that relies only on informative clues from pre-trained word embeddings. We first apply a Gaussian Hidden Markov Model and a Deep Autoencoding Gaussian Mixture Model to word embeddings for entity span detection and type prediction. We then design an instance selector based on reinforcement learning to distinguish positive sentences from noisy ones, and refine these coarse-grained annotations through neural networks. Extensive experiments on two CoNLL benchmark NER datasets (the CoNLL-2003 English dataset and the CoNLL-2002 Spanish dataset) demonstrate that our proposed light NE recognition model achieves remarkable performance without using any annotated lexicon or corpus.
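As a rough illustration of the span detection step, the sketch below runs Viterbi decoding in a two-state Gaussian-emission HMM over toy two-dimensional "embeddings". It is not the authors' implementation: the abstract gives no model details, so the state set (0 = outside, 1 = entity), the hand-set means, variance and transition probabilities, and all function names are assumptions made for illustration. In an unsupervised setting these parameters would be estimated from the embeddings rather than fixed by hand.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a spherical Gaussian with shared variance `var`."""
    d = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq / var)

def viterbi(obs, log_start, log_trans, means, var):
    """Most likely hidden state sequence for a Gaussian-emission HMM."""
    n_states = len(means)
    # score[s] = best log-probability of any path ending in state s
    score = [log_start[s] + log_gauss(obs[0], means[s], var)
             for s in range(n_states)]
    back = []
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + log_gauss(x, means[s], var))
        back.append(ptr)
    # Follow the back-pointers from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def entity_spans(path, entity_state=1):
    """Collapse runs of the entity state into (start, end) spans, end exclusive."""
    spans, start = [], None
    for i, s in enumerate(path + [None]):
        if s == entity_state and start is None:
            start = i
        elif s != entity_state and start is not None:
            spans.append((start, i))
            start = None
    return spans

# Hypothetical toy embeddings; real pre-trained vectors have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "john": [1.0, 0.9], "visited": [-1.0, -0.8], "new": [0.8, 1.1],
    "york": [1.1, 1.0], "yesterday": [-0.9, -1.1],
}
# Hand-set parameters (assumed): state 0 = outside, state 1 = entity.
MEANS = [[-1.0, -1.0], [1.0, 1.0]]
VAR = 0.5
LOG_START = [math.log(0.5), math.log(0.5)]
LOG_TRANS = [[math.log(0.7), math.log(0.3)],
             [math.log(0.3), math.log(0.7)]]

tokens = ["john", "visited", "new", "york", "yesterday"]
obs = [TOY_EMBEDDINGS[t] for t in tokens]
path = viterbi(obs, LOG_START, LOG_TRANS, MEANS, VAR)
spans = entity_spans(path)
print([" ".join(tokens[a:b]) for a, b in spans])
```

With these toy values the decoded entity spans are "john" and "new york". Type prediction and the reinforcement-learning instance selector described in the abstract are outside the scope of this sketch.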
Anthology ID:
2020.emnlp-main.723
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8995–9005
URL:
https://aclanthology.org/2020.emnlp-main.723
DOI:
10.18653/v1/2020.emnlp-main.723
Cite (ACL):
Ying Luo, Hai Zhao, and Junlang Zhan. 2020. Named Entity Recognition Only from Word Embeddings. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8995–9005, Online. Association for Computational Linguistics.
Cite (Informal):
Named Entity Recognition Only from Word Embeddings (Luo et al., EMNLP 2020)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2020.emnlp-main.723.pdf
Video:
https://slideslive.com/38938819
Data
CoNLL 2002, CoNLL-2003