Abstract
Despite their impressive scale, knowledge bases (KBs), such as Wikidata, still contain significant gaps. Language models (LMs) have been proposed as a source for filling these gaps. However, prior work has focused on prominent entities with rich coverage by LMs, neglecting the crucial case of long-tail entities. In this paper, we present a novel method for LM-based KB completion that is specifically geared for facts about long-tail entities. The method leverages two different LMs in two stages: for candidate retrieval and for candidate verification and disambiguation. To evaluate our method and various baselines, we introduce a novel dataset, called MALT, rooted in Wikidata. Our method outperforms all baselines in F1, with major gains especially in recall.
- Anthology ID:
- 2023.matching-1.8
- Volume:
- Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, ON, Canada
- Editors:
- Estevam Hruschka, Tom Mitchell, Sajjadur Rahman, Dunja Mladenić, Marko Grobelnik
- Venue:
- MATCHING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 99–108
- URL:
- https://aclanthology.org/2023.matching-1.8
- DOI:
- 10.18653/v1/2023.matching-1.8
- Cite (ACL):
- Lihu Chen, Simon Razniewski, and Gerhard Weikum. 2023. Knowledge Base Completion for Long-Tail Entities. In Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023), pages 99–108, Toronto, ON, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Knowledge Base Completion for Long-Tail Entities (Chen et al., MATCHING 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.matching-1.8.pdf