Sylvia Jaki
2026
Insights from Multilingual Gender Inclusive Language Generation Shared Task
Bharathi Raja Chakravarthi | Shunmuga Priya Muthusamy Chinnan | Paul Buitelaar | Miguel Ángel García-Cumbreras | Salud María Jiménez-Zafra | Thomas Mandl | Sylvia Jaki | Rahul Ponnusamy | Anand Kumar Madasamy | Dhanalakshmi V | Bharathi B | Premjith B | Senthil Kumar B | Sathiyaraj Thangasamy
Proceedings of the Sixth Workshop on Language Technology for Equality, Diversity, Inclusion
We investigate the role of large language models (LLMs) in promoting gender-inclusive language by evaluating their ability to rewrite biased text and generate counterfactual narratives across multiple languages. We introduce a shared task with two subtasks: gender-inclusive rewriting and counterfactual generation. The task covers five languages (English, German, Spanish, Tamil, and Kannada), reflecting diverse grammatical gender systems and sociocultural contexts. We release curated word-level and sentence-level datasets to support controlled inclusive generation. A total of 50 teams registered for the shared task, and around 8 teams submitted results. Submissions are evaluated using a hybrid framework combining rubric-based automatic scoring with expert human judgment. Finally, we provide an overview of participating systems and discuss key findings and challenges observed across languages.
2025
Human- or machine-translated subtitles: Who can tell them apart?
Ekaterina Lapshinova-Koltunski | Sylvia Jaki | Maren Bolz | Merle Sauter
Proceedings of Machine Translation Summit XX: Volume 1
This contribution investigates whether machine-translated subtitles can be easily distinguished from human-translated ones. To this end, we ran an experiment using two versions of German subtitles for an English television series: (1) produced manually by professional subtitlers, and (2) translated automatically with a large language model (LLM), i.e., GPT4. Our participants were students of translation studies with varying experience in subtitling and the use of machine translation. We asked participants to guess whether the subtitles for a selection of video clips had been translated manually or automatically. Apart from analysing whether machine-translated subtitles are distinguishable from human-translated ones, we also look for indicators of the differences between human and machine translations. Our results show that although it is hard overall to differentiate between human and machine translations, there are some differences. Notably, the more experience participants have with translation and subtitling, the better they are at telling the two translation variants apart.