PII-VisBench: Evaluating Personally Identifiable Information Safety in Vision Language Models Along a Continuum of Visibility

G. M. Shahariar; Zabir Al Nazi; Md Olid Hasan Bhuiyan; Zhouxing Shi

Abstract

Vision Language Models (VLMs) are increasingly integrated into privacy-critical domains, yet existing evaluations of personally identifiable information (PII) leakage largely treat privacy as a static extraction task and ignore how a subject’s online presence (the volume of their data available online) influences privacy alignment. We introduce **PII-VisBench**, a novel benchmark containing 4,000 unique probes designed to evaluate VLM safety along the *continuum of online presence*. The benchmark stratifies 200 subjects into four visibility categories (*high*, *medium*, *low*, and *zero*) based on the extent and nature of their information available online. We evaluate 18 open-source VLMs (0.3B–32B) on two key metrics: the percentage of PII-probing queries refused (*Refusal Rate*) and the fraction of non-refusal responses flagged for containing PII (*Conditional PII Disclosure Rate*). Across models, we observe a consistent pattern: refusals increase and PII disclosures decrease (9.10% high → 5.34% low) as subject visibility drops. That is, models are more likely to disclose PII for high-visibility subjects. We also find substantial model-family heterogeneity and PII-type disparities. Finally, paraphrasing and jailbreak-style prompts expose attack- and model-dependent failures, motivating visibility-aware safety evaluation and training interventions.
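The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the `ProbeResult` record and both function names are hypothetical, and the sketch assumes each probe has already been labeled as refused or not and, for non-refusals, as PII-flagged or not. How the paper detects refusals and flags PII is not stated in the abstract. Both values are returned as percentages, matching how the abstract reports disclosure figures.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical per-probe label; field names are assumptions."""
    refused: bool        # the model declined to answer the probe
    contains_pii: bool   # the response was flagged as containing PII


def refusal_rate(results: Sequence[ProbeResult]) -> float:
    """Percentage of all probes that the model refused."""
    if not results:
        return 0.0
    refused = sum(r.refused for r in results)
    return 100.0 * refused / len(results)


def conditional_pii_disclosure_rate(results: Sequence[ProbeResult]) -> float:
    """Percentage of non-refusal responses flagged for PII."""
    answered = [r for r in results if not r.refused]
    if not answered:
        return 0.0  # undefined when every probe is refused; 0.0 is a convention here
    flagged = sum(r.contains_pii for r in answered)
    return 100.0 * flagged / len(answered)


# Toy data: 5 probes, 1 refused, 4 answered, 1 of the answered flagged for PII.
toy_results = [
    ProbeResult(refused=True, contains_pii=False),
    ProbeResult(refused=False, contains_pii=True),
    ProbeResult(refused=False, contains_pii=False),
    ProbeResult(refused=False, contains_pii=False),
    ProbeResult(refused=False, contains_pii=False),
]
```

Note that the disclosure rate is conditioned on non-refusals, so a model that refuses more often does not automatically score better on it: the denominator shrinks along with the numerator.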

Anthology ID: 2026.findings-acl.501
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10294–10316
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.501/
Cite (ACL): G M Shahariar, Zabir Al Nazi, Md Olid Hasan Bhuiyan, and Zhouxing Shi. 2026. PII-VisBench: Evaluating Personally Identifiable Information Safety in Vision Language Models Along a Continuum of Visibility. In Findings of the Association for Computational Linguistics: ACL 2026, pages 10294–10316, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): PII-VisBench: Evaluating Personally Identifiable Information Safety in Vision Language Models Along a Continuum of Visibility (Shahariar et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.501.pdf
Checklist: 2026.findings-acl.501.checklist.pdf
