@inproceedings{ziqi-etal-2025-visual,
title = "Visual Contextual Attack: Jailbreaking {MLLM}s with Image-Driven Context Injection",
author = "Ziqi, Miao and
Ding, Yi and
Li, Lijun and
Shao, Jing",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.487/",
pages = "9638--9655",
ISBN = "979-8-89176-332-6",
abstract = "With the emergence of strong vision language capabilities, multimodal large language models (MLLMs) have demonstrated tremendous potential for real-world applications. However, the security vulnerabilities exhibited by the visual modality pose significant challenges to deploying such models in open-world environments.Recent studies have successfully induced harmful responses from target MLLMs by encoding harmful textual semantics directly into visual inputs. However, in these approaches, the visual modality primarily serves as a trigger for unsafe behavior, often exhibiting semantic ambiguity and lacking grounding in realistic scenarios. In this work, we define a novel setting: vision-centric jailbreak, where visual information serves as a necessary component in constructing a complete and realistic jailbreak context. Building on this setting, we propose the VisCo (Visual Contextual) Attack.VisCo fabricates contextual dialogue using four distinct vision-focused strategies, dynamically generating auxiliary images when necessary to construct a vision-centric jailbreak scenario.To maximize attack effectiveness, it incorporates automatic toxicity obfuscation and semantic refinement to produce a final attack prompt that reliably triggers harmful responses from the target black-box MLLMs. Specifically, VisCo achieves a toxicity score of 4.78 and an Attack Success Rate (ASR) of 85{\%} on MM-SafetyBench against GPT-4o, significantly outperforming the baseline, which achieves a toxicity score of 2.48 and an ASR of 22.2{\%}. Code: https://github.com/Dtc7w3PQ/Visco-Attack."
}