Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals

Phillip Howard; Kathleen C. Fraser; Anahita Bhiwandiwalla; Svetlana Kiritchenko

Abstract

With the advent of Large Language Models (LLMs) possessing increasingly impressive capabilities, a number of Large Vision-Language Models (LVLMs) have been proposed to augment LLMs with visual inputs. Such models condition generated text on both an input image and a text prompt, enabling a variety of use cases such as visual question answering and multimodal chat. While prior studies have examined the social biases contained in text generated by LLMs, this topic has been relatively unexplored in LVLMs. Examining social biases in LVLMs is particularly challenging due to the confounding contributions of bias induced by information contained across the text and visual modalities. To address this challenging problem, we conduct a large-scale study of text generated by different LVLMs under counterfactual changes to input images, producing over 57 million responses from popular models. Our multi-dimensional bias evaluation framework reveals that social attributes such as perceived race, gender, and physical characteristics depicted in images can significantly influence the generation of toxic content, competency-associated words, harmful stereotypes, and numerical ratings of individuals.

Anthology ID: 2025.naacl-long.305
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 5946–5991
URL: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.305/
Cite (ACL): Phillip Howard, Kathleen C. Fraser, Anahita Bhiwandiwalla, and Svetlana Kiritchenko. 2025. Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 5946–5991, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals (Howard et al., NAACL 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.305.pdf
