Clara Emmanuel


2024

pdf
Generating Gender Alternatives in Machine Translation
Sarthak Garg | Mozhdeh Gheini | Clara Emmanuel | Tatiana Likhomanenko | Qin Gao | Matthias Paulik
Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term “the nurse”) into the gendered form that is most prevalent in the systems’ training data (e.g., “enfermera”, the Spanish term for a female nurse). This often reflects and perpetuates harmful stereotypes present in society. With MT user interfaces in mind that allow for resolving gender ambiguity in a frictionless manner, we study the problem of generating all grammatically correct gendered translation alternatives. We open source train and test datasets for five language pairs and establish benchmarks for this task. Our key technical contribution is a novel semi-supervised solution for generating alternatives that integrates seamlessly with standard MT models and maintains high performance without requiring additional components or increasing inference overhead.

2022

pdf
Findings of the IWSLT 2022 Evaluation Campaign
Antonios Anastasopoulos | Loïc Barrault | Luisa Bentivogli | Marcely Zanon Boito | Ondřej Bojar | Roldano Cattoni | Anna Currey | Georgiana Dinu | Kevin Duh | Maha Elbayad | Clara Emmanuel | Yannick Estève | Marcello Federico | Christian Federmann | Souhir Gahbiche | Hongyu Gong | Roman Grundkiewicz | Barry Haddow | Benjamin Hsu | Dávid Javorský | Vĕra Kloudová | Surafel Lakew | Xutai Ma | Prashant Mathur | Paul McNamee | Kenton Murray | Maria Nǎdejde | Satoshi Nakamura | Matteo Negri | Jan Niehues | Xing Niu | John Ortega | Juan Pino | Elizabeth Salesky | Jiatong Shi | Matthias Sperber | Sebastian Stüker | Katsuhito Sudoh | Marco Turchi | Yogesh Virkar | Alexander Waibel | Changhan Wang | Shinji Watanabe
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation. A total of 27 teams participated in at least one of the shared tasks. This paper details, for each shared task, the purpose of the task, the data that were released, the evaluation metrics that were applied, the submissions that were received and the results that were achieved.