Stanislas Lauly

2022

As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased. In particular, gender accuracy in translation can have implications in terms of output fluency, translation accuracy, and ethics. In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely-spoken languages. MT-GenEval complements existing benchmarks by providing realistic, gender-balanced, counterfactual data in eight language pairs where the gender of individuals is unambiguous in the input segment, including multi-sentence segments requiring inter-sentential gender agreement. Our data and code is publicly available under a CC BY SA 3.0 license.

2020

pdf abs
Joint Translation and Unit Conversion for End-to-end Localization
Georgiana Dinu | Prashant Mathur | Marcello Federico | Stanislas Lauly | Yaser Al-Onaizan
Proceedings of the 17th International Conference on Spoken Language Translation

A variety of natural language tasks require processing of textual data which contains a mix of natural language and formal languages such as mathematical expressions. In this paper, we take unit conversions as an example and propose a data augmentation technique which lead to models learning both translation and conversion tasks as well as how to adequately switch between them for end-to-end localization.

2018

pdf
The NYU System for the CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection
Katharina Kann | Stanislas Lauly | Kyunghyun Cho
Proceedings of the CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

2017

pdf abs
Neural Machine Translation for Cross-Lingual Pronoun Prediction
Sebastien Jean | Stanislas Lauly | Orhan Firat | Kyunghyun Cho
Proceedings of the Third Workshop on Discourse in Machine Translation

In this paper we present our systems for the DiscoMT 2017 cross-lingual pronoun prediction shared task. For all four language pairs, we trained a standard attention-based neural machine translation system as well as three variants that incorporate information from the preceding source sentence. We show that our systems, which are not specifically designed for pronoun prediction and may be used to generate complete sentence translations, generally achieve competitive results on this task.

Co-authors

Raghavendra Reddy Pappagari 1