Sam Johnson

2025

pdf bib abs
The Dangers of Indirect Prompt Injection Attacks on LLM-based Autonomous Web Navigation Agents: A Demonstration
Sam Johnson | Viet Pham | Thai Le
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

This work demonstrates that LLM-based web browsing AI agents offer powerful automation capabilities but are vulnerable to Indirect Prompt Injection (IPI) attacks. We show that adversaries can embed universal adversarial triggers in webpage HTML to hijack agents that utilize the parsed-HTML accessibility tree, causing unintended or malicious actions. Using the Greedy Coordinate Gradient (GCG) algorithm and a Browser Gym agent powered by Llama-3.1, this work demonstrates high success rates across real websites in both targeted and general attacks, including login credential exfiltration and forced advertisement clicks. Our empirical results highlight critical security risks and the need for stronger defenses as LLM-driven autonomous web agents become more widely adopted. The system software is released under the MIT License at https://github.com/sej2020/manipulating-web-agents, with an accompanying publicly available demo website and video.

Co-authors

Thai Le 1
Viet Pham 1

Venues

emnlp1

Fix author