Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms

Orfeas Menis Mastromichalakis; Giorgos Filandrianos; Maria Symeonaki; Giorgos Stamou

Abstract

Machine Translation (MT) systems frequently encounter gender-ambiguous occupational terms, where they must assign gender without explicit contextual cues. While individual translations in such cases may not be inherently biased, systematic patterns—such as consistently translating certain professions with specific genders—can emerge, reflecting and perpetuating societal stereotypes. This ambiguity challenges traditional instance-level single-answer evaluation approaches, as no single gold standard translation exists. To address this, we introduce GRAPE, a probability-based metric designed to evaluate gender bias by analyzing aggregated model responses. Alongside this, we present GAMBIT, a benchmarking dataset in English with gender-ambiguous occupational terms. Using GRAPE, we evaluate several MT systems and examine whether their gendered translations in Greek and French align with or diverge from societal stereotypes, real-world occupational gender distributions, and normative standards.

Anthology ID: 2025.emnlp-main.1640
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 32221–32237
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1640/
Cite (ACL): Orfeas Menis Mastromichalakis, Giorgos Filandrianos, Maria Symeonaki, and Giorgos Stamou. 2025. Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 32221–32237, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms (Menis Mastromichalakis et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1640.pdf
Checklist: 2025.emnlp-main.1640.checklist.pdf
