Neural Cross-Lingual Named Entity Recognition with Minimal Resources
Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, Jaime Carbonell
Abstract
For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.- Anthology ID:
- D18-1034
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 369–379
- Language:
- URL:
- https://aclanthology.org/D18-1034
- DOI:
- 10.18653/v1/D18-1034
- Cite (ACL):
- Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, and Jaime Carbonell. 2018. Neural Cross-Lingual Named Entity Recognition with Minimal Resources. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 369–379, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Neural Cross-Lingual Named Entity Recognition with Minimal Resources (Xie et al., EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/D18-1034.pdf
- Code
- thespectrewithin/cross-lingual_NER
- Data
- CoNLL 2002