From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation

Najrin Sultana, Md Rafi Ur Rashid, Kang Gu, Shagufta Mehnaz


Abstract
LLMs can achieve substantial zero-shot performance on diverse tasks from a simple task prompt, eliminating the need for training or fine-tuning. However, when applying these models to sensitive tasks, it is crucial to thoroughly assess their robustness against adversarial inputs. In this work, we introduce Static Deceptor (StaDec) and Dynamic Deceptor (DyDec), two attack frameworks designed to systematically generate dynamic and adaptive adversarial examples by leveraging the language understanding of LLMs. We produce subtle and natural-looking adversarial inputs that preserve semantic similarity to the original text while effectively deceiving the target LLM. By using an automated, LLM-driven pipeline, we eliminate the dependence on external heuristics. Our attacks evolve with advancements in LLMs, while demonstrating strong transferability to models unknown to the attacker. Overall, this work provides a systematic approach for self-assessing the robustness of LLMs. We release our code and data at https://github.com/Shukti042/AdversarialExample.
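To make the "automated, LLM-driven pipeline" idea concrete, the following is a minimal, hypothetical sketch of a generic adversarial rewriting loop: a generator LLM proposes subtle rewrites, a semantic-similarity check keeps candidates close to the original, and the loop stops when the target model's prediction flips. This is not the StaDec/DyDec implementation from the paper; the query_llm and target_predict placeholders, the similarity model, and the 0.85 threshold are assumptions introduced purely for illustration.

# Hypothetical sketch of a generic LLM-driven adversarial-rewrite loop.
# NOT the StaDec/DyDec pipeline; names, prompts, and thresholds are illustrative.
from typing import Optional
from sentence_transformers import SentenceTransformer, util

SIM_THRESHOLD = 0.85  # assumed minimum semantic similarity to the original text
similarity_model = SentenceTransformer("all-MiniLM-L6-v2")

def query_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API used as the attacker's generator LLM."""
    raise NotImplementedError("Plug in an LLM client here.")

def target_predict(text: str) -> str:
    """Placeholder for the victim LLM's zero-shot prediction on the downstream task."""
    raise NotImplementedError("Plug in the target model here.")

def semantic_similarity(a: str, b: str) -> float:
    # Cosine similarity between sentence embeddings of the two texts.
    emb = similarity_model.encode([a, b], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

def adversarial_rewrite(original: str, true_label: str, max_rounds: int = 5) -> Optional[str]:
    """Iteratively ask the generator LLM for subtle rewrites until the target's prediction flips."""
    candidate = original
    for _ in range(max_rounds):
        candidate = query_llm(
            "Rewrite the following text with minimal, natural-sounding edits "
            "while preserving its meaning:\n" + candidate
        )
        # Reject rewrites that drift too far from the original meaning.
        if semantic_similarity(original, candidate) < SIM_THRESHOLD:
            candidate = original
            continue
        # Success: the target model no longer predicts the true label.
        if target_predict(candidate) != true_label:
            return candidate
    return None  # no successful adversarial example within the budget

In this generic loop, the semantic-similarity gate plays the role of the "preserve semantic similarity" constraint described in the abstract, while the repeated generator queries stand in for the adaptive, model-driven search; how StaDec and DyDec actually structure that search is detailed in the paper itself.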
Anthology ID:
2025.findings-emnlp.1244
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rosé, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
22842–22859
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1244/
DOI:
10.18653/v1/2025.findings-emnlp.1244
Cite (ACL):
Najrin Sultana, Md Rafi Ur Rashid, Kang Gu, and Shagufta Mehnaz. 2025. From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 22842–22859, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation (Sultana et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1244.pdf
Checklist:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1244.checklist.pdf