Sparsity and Noise: Where Knowledge Graph Embeddings Fall Short

Jay Pujara, Eriq Augustine, Lise Getoor


Abstract
Knowledge graph (KG) embedding techniques use structured relationships between entities to learn low-dimensional representations of entities and relations. One prominent goal of these approaches is to improve the quality of knowledge graphs by removing errors and adding missing facts. Surprisingly, most embedding techniques have been evaluated on benchmark datasets consisting of dense and reliable subsets of human-curated KGs, which tend to be fairly complete and have few errors. In this paper, we consider the problem of applying embedding techniques to KGs extracted from text, which are often incomplete and contain errors. We compare the sparsity and unreliability of different KGs and perform empirical experiments demonstrating how embedding approaches degrade as sparsity and unreliability increase.
Anthology ID:
D17-1184
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1751–1756
URL:
https://aclanthology.org/D17-1184
DOI:
10.18653/v1/D17-1184
Cite (ACL):
Jay Pujara, Eriq Augustine, and Lise Getoor. 2017. Sparsity and Noise: Where Knowledge Graph Embeddings Fall Short. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1751–1756, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Sparsity and Noise: Where Knowledge Graph Embeddings Fall Short (Pujara et al., EMNLP 2017)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D17-1184.pdf
Code
linqs/pujara-emnlp17
Data
FB15k, WN18