ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation

Argentina Anna Rescigno, Eva Vanmassenhove, Johanna Monti


Abstract
Handling gender across languages remains a persistent challenge for Machine Translation (MT) and Large Language Models (LLMs), especially when translating from largely gender-neutral languages into morphologically gendered ones, as in English-to-Italian translation. English largely omits grammatical gender, while Italian requires explicit agreement across multiple grammatical categories. This asymmetry often leads MT systems to default to masculine forms, reinforcing bias and reducing translation accuracy. To address this issue, we present the Contextual Gender Annotation (ConGA) framework, a linguistically grounded set of guidelines for word-level gender annotation. The scheme distinguishes two layers: semantic gender in English, marked with three tags (M for Masculine, F for Feminine, A for Ambiguous), and grammatical gender realisation in Italian, marked with two tags (M for Masculine, F for Feminine). The tags are combined with entity-level identifiers for cross-sentence tracking. We apply ConGA to the gENder-IT dataset, creating a gold-standard resource for evaluating gender bias in translation. Our results reveal systematic masculine overuse and inconsistent feminine realisation, highlighting persistent limitations of current MT systems. By combining fine-grained linguistic annotation with quantitative evaluation, this work offers both a methodology and a benchmark for building more gender-aware and multilingual NLP systems.
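The tag scheme described in the abstract can be pictured with a small data structure. The sketch below is illustrative only: the class `GenderAnnotation`, the `side` field, the integer `entity_id`, and the helper `masculine_share` are hypothetical names, not the format or metric used in the paper, and the sentence pair is invented for the example rather than taken from gENder-IT. It assumes that a source word such as "doctor" receives the tag F when context ("she") disambiguates it.

```python
from dataclasses import dataclass
from typing import List, Optional

SOURCE_TAGS = {"M", "F", "A"}  # English side: semantic gender, A = ambiguous
TARGET_TAGS = {"M", "F"}       # Italian side: grammatical gender realisation


@dataclass(frozen=True)
class GenderAnnotation:
    """One word-level annotation (hypothetical layout, not the paper's format)."""
    token: str
    tag: str
    entity_id: int  # assumed: same id for every mention of one entity
    side: str       # "src" (English) or "tgt" (Italian)

    def __post_init__(self) -> None:
        # Reject tags that the scheme does not allow on this side.
        if self.side not in ("src", "tgt"):
            raise ValueError(f"unknown side: {self.side!r}")
        allowed = SOURCE_TAGS if self.side == "src" else TARGET_TAGS
        if self.tag not in allowed:
            raise ValueError(f"tag {self.tag!r} not allowed on side {self.side!r}")


def masculine_share(src: List[GenderAnnotation],
                    tgt: List[GenderAnnotation],
                    source_tag: str) -> Optional[float]:
    """Fraction of target words realised as M among entities whose source
    annotation carries `source_tag`; None if no such target words exist."""
    entities = {a.entity_id for a in src if a.tag == source_tag}
    realised = [a.tag for a in tgt if a.entity_id in entities]
    if not realised:
        return None
    return realised.count("M") / len(realised)


# Invented example: "The doctor said she was tired."
# Hypothetical MT output: "Il dottore ha detto che era stanco."
src = [
    GenderAnnotation("doctor", "F", 1, "src"),
    GenderAnnotation("she", "F", 1, "src"),
]
tgt = [
    GenderAnnotation("Il", "M", 1, "tgt"),
    GenderAnnotation("dottore", "M", 1, "tgt"),
    GenderAnnotation("stanco", "M", 1, "tgt"),
]

print(masculine_share(src, tgt, "F"))  # share of M forms for F-tagged entities
```

In this invented case every Italian word agreeing with the feminine-tagged entity is realised as masculine, which is the kind of masculine default the abstract describes. Linking source and target annotations through a shared entity identifier is one possible design; the paper's guidelines may align them differently.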
Anthology ID:
2026.lrec-main.320
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
4048–4057
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.320/
Cite (ACL):
Argentina Anna Rescigno, Eva Vanmassenhove, and Johanna Monti. 2026. ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation. International Conference on Language Resources and Evaluation, main:4048–4057.
Cite (Informal):
ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation (Rescigno et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.320.pdf