2025
GENDEROUS: Machine Translation and Cross-Linguistic Evaluation of a Gender-Ambiguous Dataset
Janiça Hackenbuchner | Joke Daems | Eleni Gkovedarou
Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Contributing to research on gender beyond the binary, this work introduces GENDEROUS, a dataset of gender-ambiguous sentences containing gender-marked occupations and adjectives, and sentences with the ambiguous or non-binary pronoun their. We cross-linguistically evaluate how machine translation (MT) systems and large language models (LLMs) translate these sentences from English into four grammatical gender languages: Greek, German, Spanish and Dutch. We show the systems’ continued default to male-gendered translations, with exceptions (particularly for Dutch). Prompting for alternatives, however, shows potential in attaining more diverse and neutral translations across all languages. An LLM-as-a-judge approach was implemented, where benchmarking against gold standards emphasises the continued need for human annotations.
2024
Impact of translation workflows with and without MT on textual characteristics in literary translation
Joke Daems | Paola Ruffo | Lieve Macken
Proceedings of the 1st Workshop on Creative-text Translation and Technology
The use of machine translation is increasingly being explored for the translation of literary texts, but there is still a lot of uncertainty about the optimal translation workflow in these scenarios. While overall quality is quite good, certain textual characteristics can be different in a human translated text and a text produced by means of machine translation post-editing, which has been shown to potentially have an impact on reader perceptions and experience as well. In this study, we look at textual characteristics from short story translations from B.J. Novak’s One more thing into Dutch. Twenty-three professional literary translators translated three short stories, in three different conditions: using Word, using the classic CAT tool Trados, and using a machine translation post-editing platform specifically designed for literary translation. We look at overall text characteristics (sentence length, type-token ratio, stylistic differences) to establish whether translation workflow has an impact on these features, and whether the three workflows lead to very different final translations or not.
Automatic detection of (potential) factors in the source text leading to gender bias in machine translation
Janiça Hackenbuchner | Arda Tezcan | Joke Daems
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
This research project aims to develop a comprehensive methodology to help make machine translation (MT) systems more gender-inclusive for society. The goal is the creation of a detection system, a machine learning (ML) model trained on manual annotations, that can automatically analyse source data and detect and highlight words and phrases that influence the gender bias inflection in target translations. The main research outputs will be (1) a manually annotated dataset, (2) a taxonomy, and (3) a fine-tuned model.
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
Beatrice Savoldi | Janiça Hackenbuchner | Luisa Bentivogli | Joke Daems | Eva Vanmassenhove | Jasmijn Bastings
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
You Shall Know a Word’s Gender by the Company it Keeps: Comparing the Role of Context in Human Gender Assumptions with MT
Janiça Hackenbuchner | Joke Daems | Arda Tezcan | Aaron Maladry
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
In this paper, we analyse to what extent machine translation (MT) systems and humans base their gender translations and associations on role names and on stereotypicality in the absence of (generic) grammatical gender cues in language. We compare an MT system’s choice of gender for a certain word when translating from a notional gender language, English, into a grammatical gender language, German, with the gender associations of humans. We outline a comparative case study of gender translation and annotation of words in isolation, out-of-context, and words in sentence contexts. The analysis reveals patterns of gender (bias) by MT and gender associations by humans for certain (1) out-of-context words and (2) words in-context. Our findings reveal the impact of context on gender choice and translation and show that word-level analyses fall short in such studies.
Pilot testing gender-inclusive translations and machine translations for German quadball referee certification test takers
Joke Daems
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
Gender-inclusive translations are the default at the International Quadball Association, yet translators make different choices for the (timed) referee certification tests to improve readability. However, the actual impact of a strategy on readability and performance has not been tested. This pilot study explores the impact of translation strategy (masculine generic, gender-inclusive, and machine translation) on the speed, performance and perceptions of quadball referee test takers in German. It shows promise for inclusive over masculine strategies, and suggests limited usefulness of MT in this context.
2023
Developing User-centred Approaches to Technological Innovation in Literary Translation (DUAL-T)
Paola Ruffo | Joke Daems | Lieve Macken
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
DUAL-T is an EU-funded project which aims at involving literary translators in the testing of technology-inclusive workflows. Participants will be asked to translate three short stories using, respectively, (1) a text editor combined with online resources, (2) a Computer-Aided Translation (CAT) tool, and (3) a Machine Translation Post-editing (MTPE) tool.
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Eva Vanmassenhove | Beatrice Savoldi | Luisa Bentivogli | Joke Daems | Janiça Hackenbuchner
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Gender-inclusive translation for a gender-inclusive sport: strategies and translator perceptions at the International Quadball Association
Joke Daems
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Gender-inclusive language is of key importance to the IQA, the international governing body for quadball, a mixed-gender contact sport that explicitly welcomes players of all genders. While relatively straightforward for English, the picture becomes more complicated for most of the other IQA working languages. This paper provides an overview of the strategies currently chosen by translation team leaders for different IQA languages, the factors that influenced this decision and their connection with existing research on inclusive language strategies. It further explores the awareness and attitudes of IQA translators towards those strategies and factors.
How adaptive is adaptive machine translation, really? A gender-neutral language use case
Aida Kostikova | Joke Daems | Todor Lazarov
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
This study examines the effectiveness of adaptive machine translation (AMT) for gender-neutral language (GNL) use in English-German translation using the ModernMT engine. It investigates gender bias in initial output and adaptability to two distinct GNL strategies, as well as the influence of translation memory (TM) use on adaptivity. Findings indicate that despite inherent gender bias, machine translation (MT) systems show potential for adapting to GNL with appropriate exposure and training, highlighting the importance of customisation, exposure to diverse examples, and better representation of different forms for enhancing gender-fair translation strategies.
2022
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Helena Moniz | Lieve Macken | Andrew Rufener | Loïc Barrault | Marta R. Costa-jussà | Christophe Declercq | Maarit Koponen | Ellie Kemp | Spyridon Pilos | Mikel L. Forcada | Carolina Scarton | Joachim Van den Bogaert | Joke Daems | Arda Tezcan | Bram Vanroy | Margot Fonteyne
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
DeBiasByUs: Raising Awareness and Creating a Database of MT Bias
Joke Daems | Janiça Hackenbuchner
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
This paper presents the project initiated by the BiasByUs team resulting from the 2021 Artificially Correct Hackathon. We briefly explain our winning participation in the hackathon, tackling the challenge on ‘Database and detection of gender bias in A.I. translations’, we highlight the importance of gender bias in Machine Translation (MT), and describe our proposed solution to the challenge, the current status of the project, and our envisioned future collaborations and research.
Writing in a second Language with Machine translation (WiLMa)
Margot Fonteyne | Maribel Montero Perez | Joke Daems | Lieve Macken
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
The WiLMa project aims to assess the effects of using machine translation (MT) tools on the writing processes of second language (L2) learners of varying proficiency. Particular attention is given to individual variation in learners’ tool use.
GECO-MT: The Ghent Eye-tracking Corpus of Machine Translation
Toon Colman | Margot Fonteyne | Joke Daems | Nicolas Dirix | Lieve Macken
Proceedings of the Thirteenth Language Resources and Evaluation Conference
In the present paper, we describe a large corpus of eye movement data, collected during natural reading of a human translation and a machine translation of a full novel. This data set, called GECO-MT (Ghent Eye-tracking Corpus of Machine Translation), expands upon an earlier corpus called GECO (Ghent Eye-tracking Corpus) by Cop et al. (2017). The eye movement data in GECO-MT will be used in future research to investigate the effect of machine translation on the reading process and the effects of various error types on reading. In this article, we describe in detail the materials and data collection procedure of GECO-MT. Extensive information on the language proficiency of our participants is given, as well as a comparison with the participants of the original GECO. We investigate the distribution of a selection of important eye movement variables and explore the possibilities for future analyses of the data. GECO-MT is freely available at https://www.lt3.ugent.be/resources/geco-mt.
2020
Assessing the Comprehensibility of Automatic Translations (ArisToCAT)
Lieve Macken | Margot Fonteyne | Arda Tezcan | Joke Daems
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
The ArisToCAT project aims to assess the comprehensibility of ‘raw’ (unedited) MT output for readers who can only rely on the MT output. In this project description, we summarize the main results of the project and present future work.
2019
When a ‘sport’ is a person and other issues for NMT of novels
Arda Tezcan | Joke Daems | Lieve Macken
Proceedings of the Qualities of Literary Machine Translation
2015
The impact of machine translation error types on post-editing effort indicators
Joke Daems | Sonia Vandepitte | Robert Hartsuiker | Lieve Macken
Proceedings of the 4th Workshop on Post-editing Technology and Practice
2014
On the origin of errors: A fine-grained analysis of MT and PE errors and their relationship
Joke Daems | Lieve Macken | Sonia Vandepitte
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
In order to improve the symbiosis between machine translation (MT) system and post-editor, it is not enough to know that the output of one system is better than the output of another system. A fine-grained error analysis is needed to provide information on the type and location of errors occurring in MT and the corresponding errors occurring after post-editing (PE). This article reports on a fine-grained translation quality assessment approach which was applied to machine-translated texts and the post-edited versions of these texts, made by student post-editors. By linking each error to the corresponding source-text passage, it is possible to identify passages that were problematic in MT, but not after PE, or passages that were problematic even after PE. This method provides rich data on the origin and impact of errors, which can be used to improve post-editor training as well as machine translation systems. We present the results of a pilot experiment on the post-editing of newspaper articles and highlight the advantages of our approach.
2013
Quality as the sum of its parts: a two-step approach for the identification of translation problems and translation quality assessment for HT and MT+PE
Joke Daems | Lieve Macken | Sonia Vandepitte
Proceedings of the 2nd Workshop on Post-editing Technology and Practice