Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual

Xuansong Li, Stephanie Strassel, Heng Ji, Kira Griffitt, Joe Ellis


Abstract
To move information extraction and question answering technologies in a more realistic direction, the U.S. National Institute of Standards and Technology (NIST) initiated the Knowledge Base Population (KBP) task as one of the Text Analysis Conference (TAC) evaluation tracks. The task aims to encourage research in automatic information extraction of named entities from unstructured text, with the ultimate goal of integrating such information into a structured Knowledge Base. The KBP track consists of two types of evaluation: Named Entity Linking (NEL) and Slot Filling. This paper describes the linguistic resource creation efforts at the Linguistic Data Consortium (LDC) in support of the Named Entity Linking evaluation of KBP, focusing on annotation methodologies, process, and features of the corpora from 2009 to 2011, with a highlighted analysis of the cross-lingual NEL data. Marking the progression from monolingual to cross-lingual Entity Linking technologies, the 2011 cross-lingual NEL evaluation targeted multilingual capabilities. Annotation accuracy is presented in comparison with system performance, with promising results from cross-lingual entity linking systems.
Anthology ID:
L12-1118
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3098–3105
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/278_Paper.pdf
Cite (ACL):
Xuansong Li, Stephanie Strassel, Heng Ji, Kira Griffitt, and Joe Ellis. 2012. Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3098–3105, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual (Li et al., LREC 2012)