When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models

Francesco Ortu; Zhijing Jin; Diego Doimo; Alberto Cazzaniga

Abstract

Vision-language models (VLMs) increasingly combine visual and textual information to perform complex tasks. However, conflicts between their internal knowledge and external visual input can lead to hallucinations and unreliable predictions. In this work, we investigate the mechanisms that VLMs use to resolve cross-modal conflicts by introducing WHOOPS-AHA!, a dataset of multimodal counterfactual queries that deliberately contradict internal commonsense knowledge. Through logit inspection, we identify a small set of attention heads that mediate this conflict. By intervening in these heads, we can steer the model towards its internal parametric knowledge or the visual information. Our results show that attention patterns on these heads effectively locate image regions that influence visual overrides, providing a more precise attribution compared to gradient-based methods.

Anthology ID: 2026.acl-long.642
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 14109–14130
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.642/
Cite (ACL): Francesco Ortu, Zhijing Jin, Diego Doimo, and Alberto Cazzaniga. 2026. When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14109–14130, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models (Ortu et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.642.pdf
Checklist: 2026.acl-long.642.checklist.pdf
