Treble Counterfactual VLMs: A Causal Approach to Hallucination

Li Li, Jiashu Qu, Linxin Song, Yuxiao Zhou, Yuehan Qin, Tiankai Yang, Yue Zhao


Abstract
Vision-Language Models (VLMs) excel at tasks such as image captioning and visual question answering but frequently produce hallucinated outputs that deviate from the actual visual input or prompt. While prior work links hallucination to biases in data or representation, the causal origins of these hallucinations remain unclear. We propose a causal framework to analyze and mitigate hallucination in VLMs. Our key hypothesis is that hallucinations arise from unintended direct influences of the vision or text modality that bypass the intended multi-modal fusion. To examine this, we construct a causal graph of the VLM and use counterfactual analysis to estimate the Natural Direct Effect (NDE) of each modality and of their interaction. By systematically identifying and suppressing these direct effects, we encourage outputs that are more faithfully grounded in true cross-modal reasoning. Our approach consists of three steps: (1) designing structural causal graphs that distinguish correct fusion pathways from spurious modality shortcuts; (2) estimating modality-specific and cross-modal NDEs using perturbed image representations, hallucinated text embeddings, and degraded visual inputs; and (3) implementing a test-time intervention module that dynamically adjusts the model’s dependence on each modality. Experimental results demonstrate that our method significantly reduces hallucination while preserving task performance, providing a robust and interpretable framework for improving VLM reliability.
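
For readers unfamiliar with the causal quantity the abstract relies on, the block below gives the standard mediation-analysis form of the Natural Direct Effect for the vision pathway. The notation (V for the visual input, F for the fused multi-modal representation acting as mediator, Y for the output, v* for a degraded baseline image) is assumed for illustration and is not taken from the paper.

```latex
% Standard mediation-analysis form of the Natural Direct Effect (NDE)
% of the vision modality V on the output Y, holding the fused
% representation F (the mediator) fixed at its baseline value.
% Symbols are illustrative: v is the observed image, v* a degraded
% baseline image; none of this notation comes from the paper itself.
\[
\mathrm{NDE}_{V \to Y}
  \;=\; \mathbb{E}\!\left[\, Y\big(v,\, F(v^{*})\big) \,\right]
  \;-\; \mathbb{E}\!\left[\, Y\big(v^{*},\, F(v^{*})\big) \,\right]
\]
```

The analogous quantity for the text pathway would swap in a hallucinated-text baseline, matching the perturbations the abstract lists in step (2).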
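
The test-time intervention in step (3) can be pictured as a contrastive adjustment of decoding logits. The sketch below is a minimal illustration under assumed names: `model`, `debiased_logits`, `alpha`, and `beta` are hypothetical, and the subtraction rule is one plausible way to suppress modality-direct effects, not the paper's exact procedure.

```python
import torch


def debiased_logits(
    model,                          # hypothetical: model(image, prompt) -> next-token logits
    image: torch.Tensor,
    degraded_image: torch.Tensor,   # e.g. blurred/noised image as the visual baseline v*
    prompt: str,
    alpha: float = 0.5,             # weight on the vision-only direct effect
    beta: float = 0.5,              # weight on the text-only direct effect
) -> torch.Tensor:
    """Sketch of a test-time intervention that damps modality shortcuts.

    Assumptions (not from the paper): the model accepts an empty prompt,
    and a degraded image approximates an uninformative visual input.
    """
    logits_full = model(image, prompt)           # full cross-modal pathway
    logits_vis = model(image, "")                # vision-only shortcut (no prompt)
    logits_txt = model(degraded_image, prompt)   # text-only shortcut (uninformative image)

    # Subtract the estimated direct effects so decoding leans on true fusion.
    return logits_full - alpha * logits_vis - beta * logits_txt
```

In this framing, larger `alpha`/`beta` values would correspond to a stronger suppression of the respective modality's direct influence.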
Anthology ID: 2025.findings-emnlp.1000
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 18423–18434
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1000/
DOI: 10.18653/v1/2025.findings-emnlp.1000
Cite (ACL):
Li Li, Jiashu Qu, Linxin Song, Yuxiao Zhou, Yuehan Qin, Tiankai Yang, and Yue Zhao. 2025. Treble Counterfactual VLMs: A Causal Approach to Hallucination. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 18423–18434, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Treble Counterfactual VLMs: A Causal Approach to Hallucination (Li et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1000.pdf
Checklist: 2025.findings-emnlp.1000.checklist.pdf