Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods

Peru Bhardwaj, John Kelleher, Luca Costabello, Declan O’Sullivan


Abstract
Despite the widespread use of Knowledge Graph Embeddings (KGE), little is known about the security vulnerabilities that might disrupt their intended behaviour. We study data poisoning attacks against KGE models for link prediction. These attacks craft adversarial additions or deletions at training time to cause model failure at test time. To select adversarial deletions, we propose to use the model-agnostic instance attribution methods from Interpretable Machine Learning, which identify the training instances that are most influential to a neural model’s predictions on test instances. We use these influential triples as adversarial deletions. We further propose a heuristic method that replaces one of the two entities in each influential triple to generate adversarial additions. Our experiments show that the proposed strategies outperform the state-of-the-art data poisoning attacks on KGE models and increase the MRR degradation caused by the attacks by up to 62% over the baselines.
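The selection and replacement steps described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not name a specific attribution method or replacement rule, so the sketch uses cosine similarity between triple feature vectors as the attribution score, a DistMult-style element-wise product as the feature vector, and "swap in the least similar entity" as the replacement heuristic. The function names (triple_features, select_deletion, craft_addition) and the toy embeddings are hypothetical.

```python
import numpy as np


def triple_features(triple, ent_emb, rel_emb):
    """Feature vector of a triple (assumed: DistMult-style element-wise product)."""
    s, r, o = triple
    return ent_emb[s] * rel_emb[r] * ent_emb[o]


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def select_deletion(target, train_triples, ent_emb, rel_emb):
    """Pick the training triple in the target's neighbourhood that is most
    similar to the target triple (assumed stand-in for instance attribution)."""
    t_s, _, t_o = target
    target_feat = triple_features(target, ent_emb, rel_emb)
    # Candidate triples share at least one entity with the target triple.
    neighbours = [
        z for z in train_triples
        if z != target and ({z[0], z[2]} & {t_s, t_o})
    ]
    if not neighbours:
        return None
    return max(
        neighbours,
        key=lambda z: cosine(target_feat, triple_features(z, ent_emb, rel_emb)),
    )


def craft_addition(target, influential, ent_emb):
    """Replace one entity of the influential triple with the entity whose
    embedding is least similar to it (assumed replacement heuristic)."""
    t_s, _, t_o = target
    s, r, o = influential
    # Keep the entity shared with the target; replace the other one.
    replace_object = s in {t_s, t_o}
    old = o if replace_object else s
    sims = np.array([cosine(ent_emb[old], e) for e in ent_emb])
    sims[old] = np.inf  # never "replace" an entity with itself
    new = int(np.argmin(sims))
    return (s, r, new) if replace_object else (new, r, o)


# Toy data: 6 entities, 2 relations, random 8-dimensional embeddings.
rng = np.random.default_rng(0)
ent_emb = rng.normal(size=(6, 8))
rel_emb = rng.normal(size=(2, 8))
train_triples = [(0, 0, 1), (1, 1, 2), (3, 0, 4), (0, 1, 5), (2, 0, 3)]
target_triple = (0, 0, 2)

deletion = select_deletion(target_triple, train_triples, ent_emb, rel_emb)
addition = craft_addition(target_triple, deletion, ent_emb)
print("adversarial deletion:", deletion)
print("adversarial addition:", addition)
```

In an actual attack the embeddings would come from a trained KGE model, and the poisoned training set (with the deletion removed or the addition inserted) would be used to retrain the model before measuring the drop in MRR on the target triples.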
Anthology ID:
2021.emnlp-main.648
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8225–8239
URL:
https://aclanthology.org/2021.emnlp-main.648
DOI:
10.18653/v1/2021.emnlp-main.648
Cite (ACL):
Peru Bhardwaj, John Kelleher, Luca Costabello, and Declan O’Sullivan. 2021. Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8225–8239, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods (Bhardwaj et al., EMNLP 2021)
PDF:
https://preview.aclanthology.org/improve-issue-templates/2021.emnlp-main.648.pdf
Video:
https://preview.aclanthology.org/improve-issue-templates/2021.emnlp-main.648.mp4
Code:
perubhardwaj/attributionattack