Abstract
In recent years, joint Vision-Language (VL) models have increased in popularity and capability. Very few studies have attempted to investigate bias in VL models, even though it is a well-known issue in both individual modalities. This paper presents the first multi-dimensional analysis of bias in English VL models, focusing on gender, ethnicity, and age as dimensions. When subjects are input as images, pre-trained VL models complete a neutral template with a hurtful word 5% of the time, with higher percentages for female and young subjects. We then tested for bias in downstream models on the Visual Question Answering task, developing a novel bias metric called the Vision-Language Association Test, based on questions designed to elicit biased associations between stereotypical concepts and targets. Our findings demonstrate that pre-trained VL models contain biases that are perpetuated in downstream tasks.
- Anthology ID: 2023.findings-acl.403
- Volume: Findings of the Association for Computational Linguistics: ACL 2023
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 6445–6455
- URL: https://aclanthology.org/2023.findings-acl.403
- DOI: 10.18653/v1/2023.findings-acl.403
- Cite (ACL): Gabriele Ruggeri and Debora Nozza. 2023. A Multi-dimensional study on Bias in Vision-Language models. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6445–6455, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): A Multi-dimensional study on Bias in Vision-Language models (Ruggeri & Nozza, Findings 2023)
- PDF: https://preview.aclanthology.org/revert-3132-ingestion-checklist/2023.findings-acl.403.pdf