Once Correct, Still Wrong: Counterfactual Hallucination in Multilingual Vision-Language Models
Basel Mousi, Fahim Dalvi, Shammur Absar Chowdhury, Firoj Alam, Nadir Durrani
Abstract
Vision–language models (VLMs) can achieve high accuracy while still accepting **culturally plausible but visually incorrect** interpretations. Existing hallucination benchmarks rarely test this failure mode, particularly outside Western contexts and English. We introduce **M2CQA**, a culturally grounded multimodal benchmark built from images spanning 17 MENA countries, paired with contrastive true and counterfactual statements in English, Arabic, and its dialects. To isolate hallucination beyond raw accuracy, we propose the **CounterFactual Hallucination Rate (CFHR)**, which measures counterfactual acceptance conditioned on correctly answering the true statement. Evaluating state-of-the-art VLMs under multiple prompting strategies, we find that CFHR rises sharply in Arabic, especially in dialects, even when true-statement accuracy remains high. Moreover, reasoning-first prompting consistently increases counterfactual hallucination, while answering before justifying improves robustness. We make the dataset publicly available for the community (https://huggingface.co/datasets/QCRI/M2CQA).
- Anthology ID:
- 2026.findings-acl.234
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4763–4788
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.234/
- Cite (ACL):
- Basel Mousi, Fahim Dalvi, Shammur Absar Chowdhury, Firoj Alam, and Nadir Durrani. 2026. Once Correct, Still Wrong: Counterfactual Hallucination in Multilingual Vision-Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 4763–4788, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Once Correct, Still Wrong: Counterfactual Hallucination in Multilingual Vision-Language Models (Mousi et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.234.pdf
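
The abstract describes CFHR as counterfactual acceptance conditioned on correctly answering the true statement. The sketch below is one possible reading of that description, not the authors' implementation: it assumes each benchmark item yields a pair of boolean judgments (did the model accept the true statement, did it accept the counterfactual one) and reports the rate as a fraction. The function names `cfhr` and `true_accuracy` and the input layout are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Each pair is (accepted_true, accepted_counterfactual) for one image.
# This layout is an assumption made for illustration.
Judgment = Tuple[bool, bool]


def true_accuracy(pairs: Iterable[Judgment]) -> Optional[float]:
    """Fraction of items whose true statement was accepted."""
    items = list(pairs)
    if not items:
        return None
    return sum(1 for accepted_true, _ in items if accepted_true) / len(items)


def cfhr(pairs: Iterable[Judgment]) -> Optional[float]:
    """Assumed reading of CFHR: among items where the true statement was
    answered correctly, the fraction whose counterfactual was also accepted."""
    conditioned = [accepted_cf for accepted_true, accepted_cf in pairs if accepted_true]
    if not conditioned:
        return None  # undefined when no true statement was answered correctly
    return sum(1 for accepted_cf in conditioned if accepted_cf) / len(conditioned)


# Toy data, not results from the paper.
toy = [(True, True), (True, False), (False, True), (True, False)]
print(true_accuracy(toy), cfhr(toy))
```

Conditioning on the true statement is what separates the two numbers: in the toy data, the item that fails the true statement lowers accuracy but is excluded from the CFHR denominator.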