Abstract
Bilinear models such as DistMult and ComplEx are effective methods for knowledge graph (KG) completion. However, they require large batch sizes, which becomes a performance bottleneck when training on large-scale datasets due to memory constraints. In this paper, we use occurrences of entity-relation pairs in the dataset to construct a joint learning model and to increase the quality of sampled negatives during training. We show on three standard datasets that when these two techniques are combined, they give a significant improvement in performance, especially when the batch size and the number of generated negative examples are low relative to the size of the dataset. We then apply our techniques to a dataset containing 2 million entities and demonstrate that our model outperforms the baseline by 2.8% absolute on hits@1.
- Anthology ID:
- D19-1368
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3591–3596
- URL:
- https://aclanthology.org/D19-1368
- DOI:
- 10.18653/v1/D19-1368
- Cite (ACL):
- Esma Balkir, Masha Naslidnyk, Dave Palfrey, and Arpit Mittal. 2019. Using Pairwise Occurrence Information to Improve Knowledge Graph Completion on Large-Scale Datasets. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3591–3596, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Using Pairwise Occurrence Information to Improve Knowledge Graph Completion on Large-Scale Datasets (Balkir et al., EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/D19-1368.pdf
- Data
- FB15k, FB15k-237