GENDEROUS: Machine Translation and Cross-Linguistic Evaluation of a Gender-Ambiguous Dataset

Janiça Hackenbuchner, Joke Daems, Eleni Gkovedarou


Abstract
Contributing to research on gender beyond the binary, this work introduces GENDEROUS, a dataset of gender-ambiguous sentences containing gender-marked occupations and adjectives, and sentences with the ambiguous or non-binary pronoun "their". We cross-linguistically evaluate how machine translation (MT) systems and large language models (LLMs) translate these sentences from English into four grammatical gender languages: Greek, German, Spanish and Dutch. We show the systems’ continued default to male-gendered translations, with exceptions (particularly for Dutch). Prompting for alternatives, however, shows potential in attaining more diverse and neutral translations across all languages. We also implement an LLM-as-a-judge approach; benchmarking it against gold standards emphasises the continued need for human annotations.
Anthology ID:
2025.gebnlp-1.27
Volume:
Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Month:
August
Year:
2025
Address:
Vienna, Austria
Editors:
Agnieszka Faleńska, Christine Basta, Marta Costa-jussà, Karolina Stańczak, Debora Nozza
Venues:
GeBNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
302–319
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.gebnlp-1.27/
Cite (ACL):
Janiça Hackenbuchner, Joke Daems, and Eleni Gkovedarou. 2025. GENDEROUS: Machine Translation and Cross-Linguistic Evaluation of a Gender-Ambiguous Dataset. In Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 302–319, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
GENDEROUS: Machine Translation and Cross-Linguistic Evaluation of a Gender-Ambiguous Dataset (Hackenbuchner et al., GeBNLP 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.gebnlp-1.27.pdf