Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models

Muhammed Saeed; Shaina Raza; Ashmal Vayani; Muhammad Abdul-Mageed; Ali Emami; Shady Shehata

doi:10.18653/v1/2025.findings-emnlp.1343

Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models

Muhammed Saeed, Shaina Raza, Ashmal Vayani, Muhammad Abdul-Mageed, Ali Emami, Shady Shehata

Abstract

Research on bias in Text-to-Image (T2I) models has primarily focused on demographic representation and stereotypical attributes, overlooking a fundamental question: how does grammatical gender influence visual representation across languages? We introduce a cross-linguistic benchmark examining words where grammatical gender contradicts stereotypical gender associations (e.g., “une sentinelle” - grammatically feminine in French but referring to the stereotypically masculine concept “guard”). Our dataset spans five gendered languages (French, Spanish, German, Italian, Russian) and two gender-neutral control languages (English, Chinese), comprising 800 unique prompts that generated 28,800 images across three state-of-the-art T2I models. Our analysis reveals that grammatical gender dramatically influences image generation: masculine grammatical markers increase male representation to 73% on average (compared to 22% with gender-neutral English), while feminine grammatical markers increase female representation to 38% (compared to 28% in English). These effects vary systematically by language resource availability and model architecture, with high-resource languages showing stronger effects. Our findings establish that language structure itself, not just content, shapes AI-generated visual outputs, introducing a new dimension for understanding bias and fairness in multilingual, multimodal systems.

Anthology ID:: 2025.findings-emnlp.1343
Volume:: Findings of the Association for Computational Linguistics: EMNLP 2025
Month:: November
Year:: 2025
Address:: Suzhou, China
Editors:: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:: Findings
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 24673–24695
Language:
URL:: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1343/
DOI:: 10.18653/v1/2025.findings-emnlp.1343
Bibkey:
Cite (ACL):: Muhammed Saeed, Shaina Raza, Ashmal Vayani, Muhammad Abdul-Mageed, Ali Emami, and Shady Shehata. 2025. Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 24673–24695, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):: Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models (Saeed et al., Findings 2025)
Copy Citation:
PDF:: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1343.pdf
Checklist:: 2025.findings-emnlp.1343.checklist.pdf

PDF Cite Search Checklist Fix data