Joke Daems
This study investigates the impact of different translation workflows and underlying machine translation technologies on the translation strategies used in literary translations. We compare human translation, translation within a computer-assisted translation (CAT) tool, and machine translation post-editing (MTPE), alongside neural machine translation (NMT) and large language models (LLMs). Using three short stories translated from English into Dutch, we annotated translation difficulties and strategies employed to overcome them. Our analysis reveals differences in translation solutions across modalities, highlighting the influence of technology on the final translation. The findings suggest that while MTPE tends to produce more literal translations, human translators and CAT tools exhibit greater creativity and employ more non-literal translation strategies. Additionally, LLMs reduced the number of literal translation solutions compared to traditional NMT systems. While our study provides valuable insights, it is limited by the use of only three texts and a single language pair. Further research is needed to explore these dynamics across a broader range of texts and languages, to better understand the full impact of translation workflows and technologies on literary translation.
With quality improvements in neural machine translation (NMT), scholars have argued that human translation revision and MT post-editing are becoming more alike, which would have implications for translator training. This study contributes to this growing body of work by exploring the ability of student translators (ZH-EN) to distinguish between NMT and human translation (HT) for news text and literary text, and by analysing how text type and student perceptions influence their subsequent revision process. We found that participants were reasonably adept at distinguishing between NMT and HT, particularly for literary texts. Participants’ revision quality was dependent on the text type as well as the perceived source of translation. The findings also highlight student translators’ limited competence in revision and post-editing, emphasizing the need to integrate NMT, revision, and post-editing into translation training programmes.
Contributing to research on gender beyond the binary, this work introduces GENDEROUS, a dataset of gender-ambiguous sentences containing gender-marked occupations and adjectives, and sentences with the ambiguous or non-binary pronoun their. We cross-linguistically evaluate how machine translation (MT) systems and large language models (LLMs) translate these sentences from English into four grammatical gender languages: Greek, German, Spanish and Dutch. We show the systems’ continued default to male-gendered translations, with exceptions (particularly for Dutch). Prompting for alternatives, however, shows potential in attaining more diverse and neutral translations across all languages. An LLM-as-a-judge approach was implemented, where benchmarking against gold standards emphasises the continued need for human annotations.
As the demand for inclusive language increases, concern has grown over the susceptibility of machine translation (MT) systems to reinforce gender stereotypes. This study investigates gender bias in two commercial MT systems, Google Translate and DeepL, focusing on the understudied English-to-Greek language pair. We address three aspects of gender bias: i) male bias, ii) occupational stereotyping, and iii) errors in anti-stereotypical translations. Additionally, we explore the potential of prompted GPT-4o as a bias mitigation tool that provides both gender-explicit and gender-neutral alternatives when necessary. To achieve this, we introduce GendEL, a manually crafted bilingual dataset of 240 gender-ambiguous and unambiguous sentences that feature stereotypical occupational nouns and adjectives. We find persistent gender bias in translations by both MT systems; while they perform well in cases where gender is explicitly defined, with DeepL outperforming both Google Translate and GPT-4o in feminine gender-unambiguous sentences, they are far from producing gender-inclusive or neutral translations when the gender is unspecified. GPT-4o shows promise, generating appropriate gendered and neutral alternatives for most ambiguous cases, though residual biases remain evident. As one of the first comprehensive studies on gender bias in English-to-Greek MT, we provide both our data and code at https://github.com/elenigkove/genderbias_EN-EL_MT.
The use of machine translation is increasingly being explored for the translation of literary texts, but considerable uncertainty remains about the optimal translation workflow in these scenarios. While overall quality is quite good, certain textual characteristics can differ between a human-translated text and a text produced by means of machine translation post-editing, which has been shown to potentially affect reader perceptions and experience as well. In this study, we look at textual characteristics of short story translations from B.J. Novak’s One more thing into Dutch. Twenty-three professional literary translators translated three short stories in three different conditions: using Word, using the classic CAT tool Trados, and using a machine translation post-editing platform specifically designed for literary translation. We look at overall text characteristics (sentence length, type-token ratio, stylistic differences) to establish whether translation workflow has an impact on these features, and whether the three workflows lead to very different final translations or not.
This research project aims to develop a comprehensive methodology to help make machine translation (MT) systems more gender-inclusive for society. The goal is the creation of a detection system: a machine learning (ML) model, trained on manual annotations, that can automatically analyse source data and detect and highlight words and phrases that influence the gender bias inflection in target translations. The main research outputs will be (1) a manually annotated dataset, (2) a taxonomy, and (3) a fine-tuned model.
In this paper, we analyse to what extent machine translation (MT) systems and humans base their gender translations and associations on role names and on stereotypicality in the absence of (generic) grammatical gender cues in language. We compare an MT system’s choice of gender for a certain word when translating from a notional gender language, English, into a grammatical gender language, German, with the gender associations of humans. We outline a comparative case study of gender translation and annotation of words in isolation, out-of-context, and words in sentence contexts. The analysis reveals patterns of gender (bias) by MT and gender associations by humans for certain (1) out-of-context words and (2) words in-context. Our findings reveal the impact of context on gender choice and translation and show that word-level analyses fall short in such studies.
Gender-inclusive translations are the default at the International Quadball Association, yet translators make different choices for the (timed) referee certification tests to improve readability. However, the actual impact of a strategy on readability and performance has not been tested. This pilot study explores the impact of translation strategy (masculine generic, gender-inclusive, and machine translation) on the speed, performance and perceptions of quadball referee test takers in German. It shows promise for inclusive over masculine strategies, and suggests limited usefulness of MT in this context.
DUAL-T is an EU-funded project that aims to involve literary translators in the testing of technology-inclusive workflows. Participants will be asked to translate three short stories using, respectively, (1) a text editor combined with online resources, (2) a Computer-Aided Translation (CAT) tool, and (3) a Machine Translation Post-editing (MTPE) tool.
Gender-inclusive language is of key importance to the IQA, the international governing body for quadball, a mixed-gender contact sport that explicitly welcomes players of all genders. While relatively straightforward for English, the picture becomes more complicated for most of the other IQA working languages. This paper provides an overview of the strategies currently chosen by translation team leaders for different IQA languages, the factors that influenced this decision and their connection with existing research on inclusive language strategies. It further explores the awareness and attitudes of IQA translators towards those strategies and factors.
This study examines the effectiveness of adaptive machine translation (AMT) for gender-neutral language (GNL) use in English-German translation using the ModernMT engine. It investigates gender bias in initial output and adaptability to two distinct GNL strategies, as well as the influence of translation memory (TM) use on adaptivity. Findings indicate that despite inherent gender bias, machine translation (MT) systems show potential for adapting to GNL with appropriate exposure and training, highlighting the importance of customisation, exposure to diverse examples, and better representation of different forms for enhancing gender-fair translation strategies.
This paper presents the project initiated by the BiasByUs team resulting from the 2021 Artificially Correct Hackathon. We briefly explain our winning participation in the hackathon, tackling the challenge on ‘Database and detection of gender bias in A.I. translations’; we highlight the importance of gender bias in Machine Translation (MT), and describe our proposed solution to the challenge, the current status of the project, and our envisioned future collaborations and research.
The WiLMa project aims to assess the effects of using machine translation (MT) tools on the writing processes of second language (L2) learners of varying proficiency. Particular attention is given to individual variation in learners’ tool use.
In the present paper, we describe a large corpus of eye movement data, collected during natural reading of a human translation and a machine translation of a full novel. This data set, called GECO-MT (Ghent Eye tracking Corpus of Machine Translation) expands upon an earlier corpus called GECO (Ghent Eye-tracking Corpus) by Cop et al. (2017). The eye movement data in GECO-MT will be used in future research to investigate the effect of machine translation on the reading process and the effects of various error types on reading. In this article, we describe in detail the materials and data collection procedure of GECO-MT. Extensive information on the language proficiency of our participants is given, as well as a comparison with the participants of the original GECO. We investigate the distribution of a selection of important eye movement variables and explore the possibilities for future analyses of the data. GECO-MT is freely available at https://www.lt3.ugent.be/resources/geco-mt.
The ArisToCAT project aims to assess the comprehensibility of ‘raw’ (unedited) MT output for readers who can only rely on the MT output. In this project description, we summarize the main results of the project and present future work.
In order to improve the symbiosis between machine translation (MT) system and post-editor, it is not enough to know that the output of one system is better than the output of another system. A fine-grained error analysis is needed to provide information on the type and location of errors occurring in MT and the corresponding errors occurring after post-editing (PE). This article reports on a fine-grained translation quality assessment approach which was applied to machine-translated texts and the post-edited versions of these texts, made by student post-editors. By linking each error to the corresponding source-text passage, it is possible to identify passages that were problematic in MT, but not after PE, or passages that were problematic even after PE. This method provides rich data on the origin and impact of errors, which can be used to improve post-editor training as well as machine translation systems. We present the results of a pilot experiment on the post-editing of newspaper articles and highlight the advantages of our approach.