The LMU Munich System for the WMT 2021 Large-Scale Multilingual Machine Translation Shared Task

Wen Lai; Jindřich Libovický; Alexander Fraser

Abstract

This paper describes the submission of LMU Munich to the WMT 2021 multilingual machine translation task for Small Track #1, which studies translation between 6 languages (Croatian, Hungarian, Estonian, Serbian, Macedonian, English) in 30 directions. We investigate the extent to which bilingual translation systems can influence multilingual translation systems. More specifically, we trained 30 bilingual translation systems, covering all language pairs, and used data augmentation techniques such as back-translation and knowledge distillation to improve the multilingual translation systems. Our best translation system scores 5 to 6 BLEU higher than a strong baseline system provided by the organizers. As seen in the Dynalab leaderboard, our submission is the only fully constrained one: it uses only the corpus provided by the organizers and no pre-trained models.

Anthology ID: 2021.wmt-1.49
Volume: Proceedings of the Sixth Conference on Machine Translation
Month: November
Year: 2021
Address: Online
Editors: Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 412–417
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2021.wmt-1.49/
Cite (ACL): Wen Lai, Jindřich Libovický, and Alexander Fraser. 2021. The LMU Munich System for the WMT 2021 Large-Scale Multilingual Machine Translation Shared Task. In Proceedings of the Sixth Conference on Machine Translation, pages 412–417, Online. Association for Computational Linguistics.
Cite (Informal): The LMU Munich System for the WMT 2021 Large-Scale Multilingual Machine Translation Shared Task (Lai et al., WMT 2021)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2021.wmt-1.49.pdf
