Rethinking Post-Unlearning Behavior of Large Vision-Language Models

Minsung Kim; Nakyeong Yang; Kyomin Jung


Abstract

Large Vision-Language Models (LVLMs) can recognize individuals in images and disclose sensitive personal information about them, raising critical privacy concerns. Machine unlearning aims to remove such knowledge from the model. However, existing methods rarely prescribe what the model should output in place of the forgotten content, leading to Unlearning Aftermaths: degenerate, hallucinated, or excessively refused responses. We argue that, especially for generative LVLMs, it is crucial to consider the quality and informativeness of post-unlearning responses rather than relying solely on naive suppression. To address this, we introduce a new unlearning task for LVLMs that requires models to provide privacy-preserving yet informative and visually grounded responses. We also propose PUBG, a novel unlearning method that explicitly guides post-unlearning behavior toward a desirable output distribution. Experiments show that, while existing methods suffer from Unlearning Aftermaths despite successfully preventing privacy violations, PUBG effectively mitigates these issues, generating visually grounded and informative responses without privacy leakage for forgotten targets.

Anthology ID: 2026.findings-acl.854
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 17277–17287
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.854/
Cite (ACL): Minsung Kim, Nakyeong Yang, and Kyomin Jung. 2026. Rethinking Post-Unlearning Behavior of Large Vision-Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 17277–17287, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Rethinking Post-Unlearning Behavior of Large Vision-Language Models (Kim et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.854.pdf
Checklist: 2026.findings-acl.854.checklist.pdf
