Abstract
Entity disambiguation (ED) is the task of disambiguating named entity mentions in text to unique entries in a knowledge base. Due to its industrial relevance, as well as current progress in leveraging pre-trained language models, a multitude of ED approaches have been proposed in recent years. However, we observe a severe lack of uniformity across experimental setups in current ED work, rendering a direct comparison of approaches based solely on reported numbers impossible: Current approaches widely differ in the data set used to train, the size of the covered entity vocabulary, and the usage of additional signals such as candidate lists. To address this issue, we present ZELDA, a novel entity disambiguation benchmark that includes a unified training data set, entity vocabulary, candidate lists, as well as challenging evaluation splits covering 8 different domains. We illustrate its design and construction, and present experiments in which we train and compare current state-of-the-art approaches on our benchmark. To encourage greater direct comparability in the entity disambiguation domain, we make our benchmark publicly available to the research community.
- Anthology ID:
- 2023.eacl-main.151
- Volume:
- Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Editors:
- Andreas Vlachos, Isabelle Augenstein
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2061–2072
- URL:
- https://aclanthology.org/2023.eacl-main.151
- DOI:
- 10.18653/v1/2023.eacl-main.151
- Award:
- EACL Outstanding Paper
- Cite (ACL):
- Marcel Milich and Alan Akbik. 2023. ZELDA: A Comprehensive Benchmark for Supervised Entity Disambiguation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2061–2072, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- ZELDA: A Comprehensive Benchmark for Supervised Entity Disambiguation (Milich & Akbik, EACL 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.eacl-main.151.pdf