Low-Resource Named Entity Recognition with Cross-lingual, Character-Level Neural Conditional Random Fields

Ryan Cotterell, Kevin Duh


Abstract
Low-resource named entity recognition is still an open problem in NLP. Most state-of-the-art systems require tens of thousands of annotated sentences in order to obtain high performance. However, for most of the world’s languages it is unfeasible to obtain such annotation. In this paper, we present a transfer learning scheme, whereby we train character-level neural CRFs to predict named entities for both high-resource languages and low-resource languages jointly. Learning character representations for multiple related languages allows knowledge transfer from the high-resource languages to the low-resource ones, improving F1 by up to 9.8 points.
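As a rough illustration of the idea in the abstract, the sketch below scores tag sequences with a linear-chain CRF whose emission scores are computed from characters. It is not the authors' model: the paper describes neural character-level CRFs, whereas this toy replaces the neural encoder with a plain sum of per-character weights so that it stays short and runs without dependencies. The tag set, `char_weights`, `transitions`, and the example words are all hypothetical. The point it shows is that parameters keyed by character, rather than by language, are shared by any languages that use the same characters.

```python
import math

# Hypothetical tag set for illustration only.
TAGS = ["O", "B-PER", "I-PER"]


def char_emissions(word, char_weights):
    """Score each tag for a word by summing per-character weights.

    The table is keyed by character, not by language, so related
    languages with overlapping alphabets share these parameters.
    (Simplifying assumption: the paper uses a neural encoder instead.)
    """
    scores = [0.0] * len(TAGS)
    for ch in word:
        for t, w in enumerate(char_weights.get(ch, [0.0] * len(TAGS))):
            scores[t] += w
    return scores


def sequence_score(emissions, transitions, tags):
    """Unnormalized log-score of one tag sequence."""
    score = emissions[0][tags[0]]
    for i in range(1, len(tags)):
        score += transitions[tags[i - 1]][tags[i]] + emissions[i][tags[i]]
    return score


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emissions, transitions):
    """Forward algorithm: log of the summed scores of all tag sequences."""
    alpha = list(emissions[0])
    n_tags = len(alpha)
    for i in range(1, len(emissions)):
        alpha = [
            emissions[i][t]
            + logsumexp([alpha[s] + transitions[s][t] for s in range(n_tags)])
            for t in range(n_tags)
        ]
    return logsumexp(alpha)


def viterbi(emissions, transitions):
    """Return the highest-scoring tag sequence as a list of tag indices."""
    n_tags = len(emissions[0])
    best = list(emissions[0])
    back = []
    for i in range(1, len(emissions)):
        new_best, pointers = [], []
        for t in range(n_tags):
            s = max(range(n_tags), key=lambda p: best[p] + transitions[p][t])
            pointers.append(s)
            new_best.append(best[s] + transitions[s][t] + emissions[i][t])
        best = new_best
        back.append(pointers)
    tag = max(range(n_tags), key=lambda t: best[t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


# Hypothetical toy parameters, not taken from the paper.
char_weights = {
    "a": [0.1, 0.5, 0.2],
    "n": [0.0, 0.3, 0.4],
    "s": [0.6, -0.2, -0.1],
}
transitions = [
    [0.2, 0.1, -5.0],   # from O
    [0.0, -1.0, 0.8],   # from B-PER
    [0.3, -0.5, 0.4],   # from I-PER
]
sentence = ["ana", "san", "sss"]

emissions = [char_emissions(w, char_weights) for w in sentence]
best_path = viterbi(emissions, transitions)
# Log-probability of the best path under the CRF.
log_prob = sequence_score(emissions, transitions, best_path) - log_partition(
    emissions, transitions
)
print([TAGS[t] for t in best_path], log_prob)
```

In a joint training setup of the kind the abstract describes, sentences from the high-resource and low-resource languages would both contribute gradients to the shared character parameters. How the paper combines shared and language-specific components is not stated in the abstract, so that part is left out here.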
Anthology ID:
I17-2016
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
91–96
URL:
https://aclanthology.org/I17-2016
Cite (ACL):
Ryan Cotterell and Kevin Duh. 2017. Low-Resource Named Entity Recognition with Cross-lingual, Character-Level Neural Conditional Random Fields. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 91–96, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Low-Resource Named Entity Recognition with Cross-lingual, Character-Level Neural Conditional Random Fields (Cotterell & Duh, IJCNLP 2017)
PDF:
https://preview.aclanthology.org/auto-file-uploads/I17-2016.pdf