Illinois Cross-Lingual Wikifier: Grounding Entities in Many Languages to the English Wikipedia

Chen-Tse Tsai, Dan Roth


Abstract
We release a cross-lingual wikification system for all languages in Wikipedia. Given a piece of text in any supported language, the system identifies names of people, locations, and organizations, and grounds these names to the corresponding English Wikipedia entries. The system has two components: a cross-lingual named entity recognition (NER) model and a cross-lingual mention grounding model. The cross-lingual NER model is language-independent and can extract named entity mentions from text in any language in Wikipedia. The extracted mentions are then grounded to the English Wikipedia using the cross-lingual mention grounding model. The only resources required to train the proposed system are the multilingual Wikipedia dump and existing training data for English NER. The system is online at http://cogcomp.cs.illinois.edu/page/demo_view/xl_wikifier
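The two-stage design described in the abstract (mention extraction, then grounding to English Wikipedia titles) can be illustrated with a minimal sketch. This is not the released system or its API: the names Mention, extract_mentions, ground_mention, wikify, and TOY_TITLE_TABLE are hypothetical, the capitalization heuristic is a toy stand-in for the cross-lingual NER model, and the lookup table is a toy stand-in for the learned grounding model.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Mention:
    surface: str                  # mention text as it appears in the input
    start: int                    # character offset where the mention begins
    end: int                      # character offset just past the mention
    title: Optional[str] = None   # English Wikipedia title, None if ungrounded


# Hypothetical toy table mapping lowercased surface forms (here Spanish)
# to English Wikipedia titles. The real system learns this mapping.
TOY_TITLE_TABLE: Dict[str, str] = {
    "angela merkel": "Angela_Merkel",
    "hamburgo": "Hamburg",
    "alemania": "Germany",
}


def extract_mentions(text: str) -> List[Mention]:
    """Toy stand-in for the NER stage (assumption, not the paper's model).

    Treats each run of capitalized tokens separated only by whitespace
    as one mention. Sentence-initial words are over-generated.
    """
    spans = []
    current = None  # (start, end) of the capitalized run being built
    for match in re.finditer(r"\w+", text):
        if match.group()[0].isupper():
            gap = text[current[1]:match.start()] if current else ""
            if current is not None and gap.isspace():
                current = (current[0], match.end())  # extend the run
            else:
                if current is not None:
                    spans.append(current)
                current = (match.start(), match.end())
        elif current is not None:
            spans.append(current)
            current = None
    if current is not None:
        spans.append(current)
    return [Mention(text[s:e], s, e) for s, e in spans]


def ground_mention(mention: Mention, table: Dict[str, str]) -> Mention:
    """Toy stand-in for the grounding stage: exact lookup of the surface form."""
    mention.title = table.get(mention.surface.lower())
    return mention


def wikify(text: str, table: Dict[str, str]) -> List[Mention]:
    """Run both stages in order: extract mentions, then ground each one."""
    return [ground_mention(m, table) for m in extract_mentions(text)]


if __name__ == "__main__":
    for m in wikify("Angela Merkel nació en Hamburgo, Alemania.", TOY_TITLE_TABLE):
        print(m.surface, "->", m.title)
```

In this sketch the Spanish surface forms "Hamburgo" and "Alemania" map to the English titles Hamburg and Germany, and a mention missing from the table keeps a title of None. The point is only the separation of the two stages, so that either one could be replaced by a trained model without changing the other.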
Anthology ID:
C16-2031
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations
Month:
December
Year:
2016
Address:
Osaka, Japan
Editor:
Hideo Watanabe
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
146–150
URL:
https://aclanthology.org/C16-2031
Cite (ACL):
Chen-Tse Tsai and Dan Roth. 2016. Illinois Cross-Lingual Wikifier: Grounding Entities in Many Languages to the English Wikipedia. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, pages 146–150, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Illinois Cross-Lingual Wikifier: Grounding Entities in Many Languages to the English Wikipedia (Tsai & Roth, COLING 2016)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/C16-2031.pdf