Zihe Yan
2026
EVA: Evolving Semantic Adversaries for Red-Teaming GUI Agents Against Environmental Injection Attacks
Yijie Lu | Manman Zhao | Tianjie Ju | Zihe Yan | Xinbei Ma | Yuan Guo | Daizong Ding | Gongshen Liu | Zhuosheng Zhang
Findings of the Association for Computational Linguistics: ACL 2026
Autonomous GUI agents are inherently vulnerable to Environmental Injection Attacks (EIAs). However, existing red-teaming methods face a trade-off between requiring target-specific knowledge and incurring prohibitive computational costs. More fundamentally, a key question remains: what factors determine attack success? To answer this, we first analyze two dimensions: visual appearance (e.g., position, size, color) and semantic content. We find that semantic content dominates, while visual variations have negligible impact. Leveraging this insight, we introduce EVA, a framework that evolves payloads exclusively on the semantic dimension via a discovery-deployment pipeline. Experiments demonstrate that EVA significantly outperforms baselines, achieving 59% to 85% average Attack Success Rate (ASR) while evolving benign seeds into successful attacks within 1.18 to 1.71 iterations. This rapid convergence suggests a dense semantic attack space within the model’s latent space. Whenever an input falls into this space, the agent becomes inherently vulnerable, exposing a fundamental alignment flaw in current multimodal representations.