GRADA: Graph-based Reranking against Adversarial Documents Attack

Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, Qiongkai Xu


Abstract
Retrieval Augmented Generation (RAG) frameworks can improve the factual accuracy of large language models (LLMs) by integrating external knowledge from retrieved documents, thereby overcoming the limitations of models’ static intrinsic knowledge. However, these systems are susceptible to adversarial attacks that manipulate the retrieval process by introducing documents that are adversarial yet semantically similar to the query. Notably, while these adversarial documents resemble the query, they exhibit weak similarity to benign documents in the retrieval set. Thus, we propose a simple yet effective **G**raph-based **R**eranking against **A**dversarial **D**ocument **A**ttacks (GRADA) framework aiming at preserving retrieval quality while significantly reducing the success of adversaries. Our study evaluates the effectiveness of our approach through experiments conducted on six LLMs: GPT-3.5-Turbo, GPT-4o, Llama3.1-8b-Instruct, Llama3.1-70b-Instruct, Qwen2.5-7b-Instruct and Qwen2.5-14b-Instruct. We use three datasets to assess performance, with results from the Natural Questions dataset demonstrating up to an 80% reduction in attack success rates while maintaining minimal loss in accuracy.
Anthology ID:
2025.emnlp-main.1132
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
22255–22277
Language:
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1132/
DOI:
Bibkey:
Cite (ACL):
Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, and Qiongkai Xu. 2025. GRADA: Graph-based Reranking against Adversarial Documents Attack. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 22255–22277, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
GRADA: Graph-based Reranking against Adversarial Documents Attack (Zheng et al., EMNLP 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1132.pdf
Checklist:
 2025.emnlp-main.1132.checklist.pdf