Linking Entities to Unseen Knowledge Bases with Arbitrary Schemas

Yogarshi Vyas, Miguel Ballesteros


Abstract
In entity linking, mentions of named entities in raw text are disambiguated against a knowledge base (KB). This work focuses on linking to unseen KBs that have no training data and whose schema is unknown during training. Our approach relies on methods to flexibly convert entities with several attribute-value pairs from arbitrary KBs into flat strings, which we use in conjunction with state-of-the-art models for zero-shot linking. We further improve the generalization of our model using two regularization schemes based on shuffling of entity attributes and handling of unseen attributes. Experiments on English datasets, where models are trained on the CoNLL dataset and tested on the TAC-KBP 2010 dataset, show that our models are 12% (absolute) more accurate than baseline models that simply flatten entities from the target KB. Unlike prior work, our approach also allows for seamlessly combining multiple training datasets. We test this ability by adding both a completely different dataset (Wikia) and increasing amounts of training data from the TAC-KBP 2010 training set. Our models are more accurate across the board compared to baselines.
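The core idea in the abstract — serializing an entity's attribute-value pairs into a flat string, with attribute shuffling as a training-time regularizer — can be sketched as follows. This is an illustrative sketch only: the separator tokens, attribute formatting, and function name are assumptions, not the authors' exact serialization.

```python
import random

def flatten_entity(entity, sep=" ; ", shuffle=False, rng=None):
    """Flatten a KB entity (dict of attribute -> value) into one string.

    Sketch of converting entities with arbitrary schemas into flat strings
    for a zero-shot linker. With shuffle=True, attribute order is randomized
    (mimicking the shuffling regularization described in the abstract), so a
    model trained on these strings cannot rely on a fixed schema order.
    """
    rng = rng or random.Random()
    items = list(entity.items())
    if shuffle:
        rng.shuffle(items)
    return sep.join(f"{attr} : {val}" for attr, val in items)

# Hypothetical entity with an arbitrary schema:
entity = {"name": "Barack Obama", "type": "PER",
          "description": "44th president of the United States"}
print(flatten_entity(entity))
print(flatten_entity(entity, shuffle=True, rng=random.Random(0)))
```

The flattened string can then be fed to any text-pair encoder alongside the mention context, which is what lets the same model handle KBs with schemas never seen in training.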
Anthology ID:
2021.naacl-main.65
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
834–844
URL:
https://aclanthology.org/2021.naacl-main.65
DOI:
10.18653/v1/2021.naacl-main.65
Cite (ACL):
Yogarshi Vyas and Miguel Ballesteros. 2021. Linking Entities to Unseen Knowledge Bases with Arbitrary Schemas. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 834–844, Online. Association for Computational Linguistics.
Cite (Informal):
Linking Entities to Unseen Knowledge Bases with Arbitrary Schemas (Vyas & Ballesteros, NAACL 2021)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2021.naacl-main.65.pdf
Video:
https://preview.aclanthology.org/naacl-24-ws-corrections/2021.naacl-main.65.mp4