An Empirical Study on Fine-Grained Named Entity Recognition

Khai Mai, Thai-Hoang Pham, Minh Trung Nguyen, Tuan Duc Nguyen, Danushka Bollegala, Ryohei Sasano, Satoshi Sekine


Abstract
Named entity recognition (NER) has attracted a substantial amount of research. Recently, several neural network-based models have been proposed and have achieved high performance. However, there is little research on fine-grained NER (FG-NER), in which hundreds of named entity categories must be recognized, especially for non-English languages. It remains an open question whether a single model is robust across settings or whether the appropriate model depends on the language, the number of named entity categories, and the size of the training dataset. This paper first presents an empirical comparison of FG-NER models for English and Japanese and demonstrates that LSTM+CNN+CRF (Ma and Hovy, 2016), one of the state-of-the-art methods for English NER, also works well for English FG-NER but does not work well for Japanese, a language that has a large number of character types. To tackle this problem, we propose a method to improve neural network-based Japanese FG-NER performance by removing the CNN layer and utilizing dictionary and category embeddings. Experimental results show that the proposed method improves the Japanese FG-NER F-score from 66.76% to 75.18%.
Anthology ID:
C18-1060
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
711–722
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/C18-1060/
Cite (ACL):
Khai Mai, Thai-Hoang Pham, Minh Trung Nguyen, Tuan Duc Nguyen, Danushka Bollegala, Ryohei Sasano, and Satoshi Sekine. 2018. An Empirical Study on Fine-Grained Named Entity Recognition. In Proceedings of the 27th International Conference on Computational Linguistics, pages 711–722, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
An Empirical Study on Fine-Grained Named Entity Recognition (Mai et al., COLING 2018)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/C18-1060.pdf
Data
FIGER