Character-Aware Neural Networks for Arabic Named Entity Recognition for Social Media

Mourad Gridach


Abstract
Named Entity Recognition (NER) is the task of classifying or labelling atomic elements in text into categories such as Person, Location or Organisation. For Arabic, recognizing named entities is challenging because of the complexity and unique characteristics of the language. In addition, most previous work focuses on Modern Standard Arabic (MSA), whereas recognizing named entities in social media is attracting growing interest. Social media text mixes Dialectal Arabic (DA) with MSA, which poses a further challenge. Most state-of-the-art Arabic NER systems rely heavily on hand-crafted features and lexicons, which are time-consuming to build. In this paper, we introduce a novel neural network architecture that automatically benefits from both character- and word-level representations by combining a bidirectional LSTM with a Conditional Random Field (CRF), eliminating the need for most feature engineering. Moreover, our model relies on unsupervised word representations learned from unannotated corpora. Experimental results demonstrate that our model achieves state-of-the-art performance on a publicly available benchmark for Arabic NER on social media, surpassing the previous system by a large margin.
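The abstract names a CRF layer on top of a bidirectional LSTM but gives no implementation details. As a rough illustration of how such a layer is typically decoded, the sketch below runs Viterbi search over per-token tag scores. Everything in it is an assumption for illustration: the function name `viterbi_decode`, the score layout, and the toy numbers are not taken from the paper, and the emission scores are simply given as input rather than computed by a network.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   -- score of tag j at token t (hypothetically produced
                         by a BiLSTM over word and character features).
    transitions[i][j] -- score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # Best score of any path ending in each tag at the current token.
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_tags):
            # Pick the previous tag that maximises the path score into tag j.
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example with hypothetical tags 0 = O, 1 = B-PER, 2 = I-PER.
toy_emissions = [[0.1, 2.0, 0.0], [1.0, 0.0, 0.9], [2.0, 0.0, 0.1]]
toy_transitions = [[0.0, 0.0, -10.0],  # O -> I-PER is strongly penalised
                   [0.0, 0.0, 1.0],    # B-PER -> I-PER is encouraged
                   [0.0, 0.0, 0.5]]
print(viterbi_decode(toy_emissions, toy_transitions))
```

In this toy case, picking the best tag independently at each token would label the second token O, while the transition scores lead the joint decoding to prefer I-PER after B-PER. That sequence-level preference is the usual motivation for placing a CRF on top of per-token scores.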
Anthology ID:
W16-3703
Volume:
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Dekai Wu, Pushpak Bhattacharyya
Venue:
WSSANLP
Publisher:
The COLING 2016 Organizing Committee
Pages:
23–32
URL:
https://aclanthology.org/W16-3703
Cite (ACL):
Mourad Gridach. 2016. Character-Aware Neural Networks for Arabic Named Entity Recognition for Social Media. In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), pages 23–32, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Character-Aware Neural Networks for Arabic Named Entity Recognition for Social Media (Gridach, WSSANLP 2016)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/W16-3703.pdf
Data
SST