Miðeind’s WMT 2021 Submission
Haukur Barri Símonarson, Vésteinn Snæbjarnarson, Pétur Orri Ragnarson, Haukur Jónsson, Vilhjalmur Thorsteinsson
Abstract
We present Miðeind’s submission for the English→Icelandic and Icelandic→English subsets of the 2021 WMT news translation task. Transformer-base models are trained for translation on parallel data to generate backtranslations iteratively. A pretrained mBART-25 model is then adapted for translation using parallel data as well as the last backtranslation iteration. This adapted pretrained model is then used to re-generate backtranslations, and the training of the adapted model is continued.
- Anthology ID:
- 2021.wmt-1.9
- Volume:
- Proceedings of the Sixth Conference on Machine Translation
- Month:
- November
- Year:
- 2021
- Address:
- Online
- Editors:
- Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 136–139
- URL:
- https://aclanthology.org/2021.wmt-1.9
- Cite (ACL):
- Haukur Barri Símonarson, Vésteinn Snæbjarnarson, Pétur Orri Ragnarson, Haukur Jónsson, and Vilhjalmur Thorsteinsson. 2021. Miðeind’s WMT 2021 Submission. In Proceedings of the Sixth Conference on Machine Translation, pages 136–139, Online. Association for Computational Linguistics.
- Cite (Informal):
- Miðeind’s WMT 2021 Submission (Símonarson et al., WMT 2021)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2021.wmt-1.9.pdf
- Data
- CCMatrix, IPAC, ParaCrawl
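
The two-stage procedure outlined in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `ToyModel` is a lookup table standing in for a real translation model, `train` stands in for both Transformer-base training and mBART-25 adaptation, and keeping one model per translation direction is an assumption. The sketch shows only the data flow: backtranslate monolingual text, retrain on parallel plus synthetic pairs, then repeat with the adapted model.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (input sentence, output sentence)


class ToyModel:
    """Hypothetical stand-in for an NMT model: it only memorises training pairs."""

    def __init__(self, pairs: List[Pair]) -> None:
        self.table: Dict[str, str] = dict(pairs)

    def translate(self, sentence: str) -> str:
        # Unknown inputs are marked rather than translated.
        return self.table.get(sentence, f"<unk:{sentence}>")


def train(pairs: List[Pair]) -> ToyModel:
    """Placeholder for training (or continued training) on sentence pairs."""
    return ToyModel(pairs)


def flip(pairs: List[Pair]) -> List[Pair]:
    """Swap the two sides of each pair to train the opposite direction."""
    return [(tgt, src) for src, tgt in pairs]


def backtranslate(model: ToyModel, monolingual: List[str]) -> List[Pair]:
    """Pair each real sentence with a synthetic input produced by `model`."""
    return [(model.translate(sent), sent) for sent in monolingual]


def iterative_backtranslation(
    parallel: List[Pair], mono_src: List[str], mono_tgt: List[str], rounds: int
) -> Tuple[List[Pair], List[Pair]]:
    """Stage 1: alternate backtranslation and retraining; return the last synthetic data."""
    fwd = train(parallel)        # source -> target
    bwd = train(flip(parallel))  # target -> source
    synth_fwd: List[Pair] = []
    synth_bwd: List[Pair] = []
    for _ in range(rounds):
        # Each direction's synthetic inputs come from the opposite-direction model.
        synth_fwd = backtranslate(bwd, mono_tgt)
        synth_bwd = backtranslate(fwd, mono_src)
        # Real pairs are listed last so they win over synthetic duplicates.
        fwd = train(synth_fwd + parallel)
        bwd = train(synth_bwd + flip(parallel))
    return synth_fwd, synth_bwd


def adapt_and_continue(
    parallel: List[Pair],
    mono_src: List[str],
    mono_tgt: List[str],
    synth_fwd: List[Pair],
    synth_bwd: List[Pair],
) -> Tuple[ToyModel, ToyModel]:
    """Stage 2: adapt on parallel + last backtranslations, regenerate, continue training."""
    # In a real system this step would start from a pretrained model (assumption: toy `train`).
    fwd = train(synth_fwd + parallel)
    bwd = train(synth_bwd + flip(parallel))
    # Regenerate backtranslations with the adapted models.
    synth_fwd = backtranslate(bwd, mono_tgt)
    synth_bwd = backtranslate(fwd, mono_src)
    # Continue training on the refreshed synthetic data.
    fwd = train(synth_fwd + parallel)
    bwd = train(synth_bwd + flip(parallel))
    return fwd, bwd
```

A call such as `iterative_backtranslation(parallel, mono_src, mono_tgt, rounds=2)` followed by `adapt_and_continue(...)` mirrors the order of steps in the abstract; the number of rounds and the way synthetic and real data are mixed are placeholders, since the abstract does not specify them.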