Abstract
We evaluate gender biases in multilingual multimodal image and text models in two settings: text-to-image retrieval and text-to-image generation, to show that even seemingly gender-neutral traits generate biased results. We evaluate our framework in the context of people from India, working with two languages: English and Hindi. We work with frameworks built around mCLIP-based models to ensure a thorough evaluation of recent state-of-the-art models in the multilingual setting due to their potential for widespread applications. We analyze the results across 50 traits for retrieval and 8 traits for generation, showing that current multilingual multimodal models are biased towards men for most traits, and this problem is further exacerbated for lower-resource languages like Hindi. We further discuss potential reasons behind this observation, particularly stemming from the bias introduced by the pretraining datasets.
- Anthology ID:
- 2024.gebnlp-1.21
- Volume:
- Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Agnieszka Faleńska, Christine Basta, Marta Costa-jussà, Seraphina Goldfarb-Tarrant, Debora Nozza
- Venues:
- GeBNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 338–350
- URL:
- https://aclanthology.org/2024.gebnlp-1.21
- DOI:
- 10.18653/v1/2024.gebnlp-1.21
- Cite (ACL):
- Kshitish Ghate, Arjun Choudhry, and Vanya Bannihatti Kumar. 2024. Evaluating Gender Bias in Multilingual Multimodal AI Models: Insights from an Indian Context. In Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 338–350, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- Evaluating Gender Bias in Multilingual Multimodal AI Models: Insights from an Indian Context (Ghate et al., GeBNLP-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2024.gebnlp-1.21.pdf