When Background Matters: Breaking Medical Vision Language Models by Transferable Attack

Akash Ghosh, Subhadip Baidya, Sriparna Saha, Xiuying Chen


Abstract
Vision–Language Models (VLMs) are increasingly used in clinical diagnostics, yet their robustness to adversarial attacks remains largely unexplored, posing serious risks. Existing medical attacks focus on secondary objectives such as model stealing or adversarial fine-tuning, while transferable attacks from natural images introduce visible distortions that clinicians can easily detect. To address this, we propose MedFocusLeak, a highly transferable black-box multimodal attack that induces incorrect yet clinically plausible diagnoses while keeping perturbations imperceptible. The method injects coordinated perturbations into non-diagnostic background regions and employs an attention-distraction mechanism to shift the model’s focus away from pathological areas. Extensive evaluations across six medical imaging modalities show that MedFocusLeak achieves state-of-the-art performance, generating misleading yet realistic diagnostic outputs across diverse VLMs. We further introduce a unified evaluation framework with novel metrics that jointly capture attack success and image fidelity, revealing a critical weakness in the reasoning capabilities of modern clinical VLMs.
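The abstract does not say how the perturbations are computed, so the following is only an illustrative sketch of the general idea of confining a bounded perturbation to non-diagnostic background pixels. It is not the MedFocusLeak algorithm. The function name `apply_background_perturbation`, the `diagnostic_mask` input, the L-infinity budget `epsilon = 8/255`, and the random `delta` are all assumptions made for illustration. The attention-distraction mechanism is not modeled here.

```python
import numpy as np


def apply_background_perturbation(image, diagnostic_mask, delta, epsilon=8 / 255):
    """Add `delta` to `image` only where `diagnostic_mask` is False.

    image:           2-D float array with values in [0, 1] (hypothetical input)
    diagnostic_mask: 2-D bool array, True on pixels assumed to be diagnostic
    delta:           2-D float array, the raw (unbounded) perturbation
    epsilon:         assumed L-infinity budget that keeps the change small
    """
    # Bound every perturbation value to [-epsilon, epsilon].
    bounded = np.clip(delta, -epsilon, epsilon)
    # Zero the perturbation inside the diagnostic region.
    background = ~diagnostic_mask.astype(bool)
    perturbed = image + bounded * background
    # Keep pixel intensities in the valid range.
    return np.clip(perturbed, 0.0, 1.0)


# Toy data: a random "scan" with a square region marked as diagnostic.
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(64, 64))
diagnostic_mask = np.zeros((64, 64), dtype=bool)
diagnostic_mask[20:40, 20:40] = True

# A random perturbation stands in for whatever an attack would optimize.
delta = rng.normal(0.0, 0.1, size=(64, 64))
adv = apply_background_perturbation(image, diagnostic_mask, delta)
```

In this sketch the pixels inside the mask are left untouched and every other pixel changes by at most `epsilon`. A real attack would optimize `delta` against one or more surrogate models rather than sampling it at random.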
Anthology ID:
2026.acl-long.1768
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
38143–38170
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1768/
Cite (ACL):
Akash Ghosh, Subhadip Baidya, Sriparna Saha, and Xiuying Chen. 2026. When Background Matters: Breaking Medical Vision Language Models by Transferable Attack. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 38143–38170, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
When Background Matters: Breaking Medical Vision Language Models by Transferable Attack (Ghosh et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1768.pdf
Checklist:
 2026.acl-long.1768.checklist.pdf