Lichao Zhu


2022

Flux d’informations dans les systèmes encodeur-décodeur. Application à l’explication des biais de genre dans les systèmes de traduction automatique. (Information flow in encoder-decoder systems applied to the explanation of gender bias in machine translation systems)
Lichao Zhu | Guillaume Wisniewski | Nicolas Ballier | François Yvon
Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Atelier TAL et Humanités Numériques (TAL-HN)

This work presents two series of experiments aimed at identifying information flows in neural translation systems. The first series compares the decisions of a language model with those of a translation model to highlight the flow of information coming from the source. The second series shows the impact of these flows on the system's training in the particular case of gender information transfer.

Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System
Guillaume Wisniewski | Lichao Zhu | Nicolas Ballier | François Yvon
Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Multiple studies have shown that existing NMT systems demonstrate some kind of “gender bias”. As a result, MT output appears to err more often for feminine forms and to amplify social gender misrepresentations, which is potentially harmful to users and practitioners of these technologies. This paper continues this line of investigation and reports results obtained with a new test set in strictly controlled conditions. This setting allows us to better understand the multiple inner mechanisms that are causing these biases, which include the linguistic expressions of gender, the unbalanced distribution of masculine and feminine forms in the language, the modelling of morphological variation and the training process dynamics. To counterbalance these effects, we formulate several proposals and notably show that modifying the training loss can effectively mitigate such biases.

The SPECTRANS System Description for the WMT22 Biomedical Task
Nicolas Ballier | Jean-Baptiste Yunès | Guillaume Wisniewski | Lichao Zhu | Maria Zimina
Proceedings of the Seventh Conference on Machine Translation (WMT)

This paper describes the SPECTRANS submission for the WMT 2022 biomedical shared task. We present the results of our experiments using the training corpora and the JoeyNMT (Kreutzer et al., 2019) and SYSTRAN Pure Neural Server/Advanced Model Studio toolkits for the language directions English to French and French to English. We compare the predictions of the different toolkits. We also use JoeyNMT to fine-tune the model with a selection of texts from WMT, Khresmoi and UFAL data sets. We report our results and assess the respective merits of the different translated texts.

2021

Screening Gender Transfer in Neural Machine Translation
Guillaume Wisniewski | Lichao Zhu | Nicolas Ballier | François Yvon
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

This paper aims at identifying the information flow in state-of-the-art machine translation systems, taking as an example the transfer of gender when translating from French into English. Using a controlled set of examples, we experiment with several ways to investigate how gender information circulates in an encoder-decoder architecture, considering both probing techniques and interventions on the internal representations used in the MT system. Our results show that gender information can be found in all token representations built by the encoder and the decoder and lead us to conclude that there are multiple pathways for gender transfer.

The SPECTRANS System Description for the WMT21 Terminology Task
Nicolas Ballier | Dahn Cho | Bilal Faye | Zong-You Ke | Hanna Martikainen | Mojca Pecman | Guillaume Wisniewski | Jean-Baptiste Yunès | Lichao Zhu | Maria Zimina-Poirot
Proceedings of the Sixth Conference on Machine Translation

This paper discusses the WMT 2021 terminology shared task from a “meta” perspective. We present the results of our experiments using the terminology dataset and the OpenNMT (Klein et al., 2017) and JoeyNMT (Kreutzer et al., 2019) toolkits for the language direction English to French. Experiment 1 compares the predictions of the two toolkits. Experiment 2 uses OpenNMT to fine-tune the model. We report our results for the task with the evaluation script but mostly discuss the linguistic properties of the terminology dataset provided for the task. We provide evidence of the importance of text genres across scores, having replicated the evaluation scripts.

Biais de genre dans un système de traduction automatique neuronale : une étude préliminaire (Gender Bias in Neural Translation: a preliminary study)
Guillaume Wisniewski | Lichao Zhu | Nicolas Ballier | François Yvon
Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale

This article presents the first results of an ongoing study on gender bias in training corpora and in neural translation systems. In particular, we study a minimal, controlled corpus to measure the intensity of these biases in both the English-French and French-English directions; this controlled setting also allows us to analyze the internal representations that the system manipulates to make its lexical predictions, and to formulate hypotheses about how this bias is distributed across the system's representations.