The LMU Munich Unsupervised Machine Translation System for WMT19
Dario Stojanovski, Viktor Hangya, Matthias Huck, Alexander Fraser
Abstract
We describe LMU Munich’s machine translation system for German→Czech translation which was used to participate in the WMT19 shared task on unsupervised news translation. We train our model using monolingual data only from both languages. The final model is an unsupervised neural model using established techniques for unsupervised translation such as denoising autoencoding and online back-translation. We bootstrap the model with masked language model pretraining and enhance it with back-translations from an unsupervised phrase-based system which is itself bootstrapped using unsupervised bilingual word embeddings.
- Anthology ID:
- W19-5344
- Volume:
- Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 393–399
- URL:
- https://aclanthology.org/W19-5344
- DOI:
- 10.18653/v1/W19-5344
- Cite (ACL):
- Dario Stojanovski, Viktor Hangya, Matthias Huck, and Alexander Fraser. 2019. The LMU Munich Unsupervised Machine Translation System for WMT19. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 393–399, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- The LMU Munich Unsupervised Machine Translation System for WMT19 (Stojanovski et al., WMT 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/W19-5344.pdf