Wen Lai


2022

Improving Both Domain Robustness and Domain Adaptability in Machine Translation
Wen Lai | Jindřich Libovický | Alexander Fraser
Proceedings of the 29th International Conference on Computational Linguistics

We consider two problems of NMT domain adaptation using meta-learning. First, we want to achieve domain robustness, i.e., high translation quality both on domains seen in the training data and on unseen domains. Second, we want our systems to be adaptive, i.e., it should be possible to fine-tune a system with just hundreds of in-domain parallel sentences. We study the domain adaptability of meta-learning while improving the domain robustness of the model. In this paper, we propose a novel approach, RMLNMT (Robust Meta-Learning Framework for Neural Machine Translation Domain Adaptation), which improves the robustness of existing meta-learning models. More specifically, we show how to use a domain classifier for curriculum learning, and we integrate a word-level domain-mixing model into the meta-learning framework with a balanced sampling strategy. Experiments on English-German and English-Chinese translation show that RMLNMT improves both domain robustness and domain adaptability, in seen as well as unseen domains.
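The abstract names two concrete ingredients: scoring training sentences with a domain classifier to build a curriculum, and sampling meta-learning tasks so that domains are balanced. The sketch below is only a schematic reading of those two ideas, not the paper's implementation; the record layout, scores, and function names are invented for illustration.

    import random
    from collections import defaultdict

    # Hypothetical record layout: (source, target, domain, score), where
    # `score` stands in for a domain classifier's confidence that the
    # pair belongs to its labeled domain.

    def curriculum_order(corpus):
        # Score-based curriculum: present the most in-domain (highest
        # scoring) pairs first, then progressively less in-domain ones.
        return sorted(corpus, key=lambda ex: ex[3], reverse=True)

    def balanced_meta_tasks(corpus, task_size, rng=random):
        # Balanced sampling: each meta-learning task draws equally from
        # every domain, so high-resource domains cannot dominate.
        by_domain = defaultdict(list)
        for ex in corpus:
            by_domain[ex[2]].append(ex)
        per_domain = task_size // len(by_domain)
        while True:
            task = []
            for examples in by_domain.values():
                task.extend(rng.sample(examples, per_domain))
            yield task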

2021

The LMU Munich System for the WMT 2021 Large-Scale Multilingual Machine Translation Shared Task
Wen Lai | Jindřich Libovický | Alexander Fraser
Proceedings of the Sixth Conference on Machine Translation

This paper describes the submission of LMU Munich to the WMT 2021 multilingual machine translation task for small track #1, which studies translation among six languages (Croatian, Hungarian, Estonian, Serbian, Macedonian, English) in 30 directions. We investigate the extent to which bilingual translation systems can influence multilingual translation systems. More specifically, we trained 30 bilingual translation systems, covering all language pairs, and used data augmentation techniques such as back-translation and knowledge distillation to improve the multilingual systems. Our best system scores 5 to 6 BLEU points higher than a strong baseline provided by the organizers. As shown on the Dynalab leaderboard, ours is the only fully constrained submission, i.e., it uses only the corpora provided by the organizers and no pre-trained models.
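As a hedged illustration of the back-translation step mentioned above: a reverse (target-to-source) model translates monolingual target-side text to create synthetic parallel pairs, which are then mixed with the authentic data. The `reverse_model.translate` interface below is a placeholder, not the actual toolkit the authors used.

    def back_translate(mono_target_sents, reverse_model):
        # Create synthetic pairs: machine-translate monolingual
        # target-side sentences back into the source language.
        synthetic_sources = [reverse_model.translate(t)
                             for t in mono_target_sents]
        return list(zip(synthetic_sources, mono_target_sents))

    def augmented_corpus(parallel_pairs, mono_target_sents, reverse_model):
        # Mix authentic and back-translated pairs to train the
        # forward (source-to-target) multilingual model.
        return parallel_pairs + back_translate(mono_target_sents,
                                               reverse_model)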

2018

Tibetan-Chinese Neural Machine Translation based on Syllable Segmentation
Wen Lai | Xiaobing Zhao | Wei Bao
Proceedings of the AMTA 2018 Workshop on Technologies for MT of Low Resource Languages (LoResMT 2018)
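No abstract is listed for this entry. For context on the title's key term: written Tibetan conventionally delimits syllables with the tsheg mark (U+0F0B), so a minimal syllable segmenter can split on that character. The sketch below is generic background, assuming that convention, and is not the paper's exact preprocessing.

    TSHEG = "\u0F0B"  # Tibetan tsheg, the standard syllable delimiter

    def segment_syllables(text):
        # Split Tibetan text into syllables at tsheg marks; a fuller
        # segmenter would also handle shad (U+0F0D) and whitespace.
        return [s for s in text.split(TSHEG) if s]

    # e.g. segment_syllables("བོད་སྐད") -> ["བོད", "སྐད"]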