Miðeind’s WMT 2021 Submission

Haukur Barri Símonarson, Vésteinn Snæbjarnarson, Pétur Orri Ragnarson, Haukur Jónsson, Vilhjalmur Thorsteinsson


Abstract
We present Miðeind’s submission for the English→Icelandic and Icelandic→English subsets of the 2021 WMT news translation task. Transformer-base models are trained for translation on parallel data and used to generate backtranslations iteratively. A pretrained mBART-25 model is then adapted for translation using parallel data as well as the last backtranslation iteration. This adapted pretrained model is then used to re-generate backtranslations, and the training of the adapted model is continued.
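The iterative backtranslation loop mentioned in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the function `iterative_backtranslation`, the `train` callable, the lookup-table `toy_train`, and the number of rounds are all hypothetical, and the later mBART-25 adaptation stage is not modelled.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]                   # (source sentence, target sentence)
Model = Callable[[str], str]             # a translation function
Trainer = Callable[[List[Pair]], Model]  # hypothetical training routine


def iterative_backtranslation(
    parallel: List[Pair],
    mono_src: List[str],
    mono_tgt: List[str],
    train: Trainer,
    rounds: int = 2,  # assumed; the abstract does not give a count
) -> Tuple[Model, Model, List[Pair], List[Pair]]:
    """Alternately retrain both directions on parallel plus synthetic data."""
    flip = lambda pairs: [(t, s) for s, t in pairs]
    synth_fwd: List[Pair] = []  # (synthetic source, real target)
    synth_bwd: List[Pair] = []  # (synthetic target, real source)
    fwd = train(parallel)        # source -> target
    bwd = train(flip(parallel))  # target -> source
    for _ in range(rounds):
        # Translate monolingual text with the opposite-direction model,
        # so the real sentence always ends up on the output side.
        synth_fwd = [(bwd(t), t) for t in mono_tgt]
        synth_bwd = [(fwd(s), s) for s in mono_src]
        fwd = train(parallel + synth_fwd)
        bwd = train(flip(parallel) + synth_bwd)
    return fwd, bwd, synth_fwd, synth_bwd


def toy_train(pairs: List[Pair]) -> Model:
    """Stand-in trainer: memorises pairs, copies unseen input unchanged."""
    table = dict(pairs)
    return lambda sentence: table.get(sentence, sentence)


parallel = [("hello", "halló")]
fwd, bwd, synth_fwd, synth_bwd = iterative_backtranslation(
    parallel, mono_src=["hello"], mono_tgt=["halló"], train=toy_train
)
```

In a real system `train` would fit a neural translation model; the point of the sketch is only that each round's synthetic pairs are produced by the previous round's models.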
Anthology ID:
2021.wmt-1.9
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Editors:
Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
136–139
URL:
https://aclanthology.org/2021.wmt-1.9
Cite (ACL):
Haukur Barri Símonarson, Vésteinn Snæbjarnarson, Pétur Orri Ragnarson, Haukur Jónsson, and Vilhjalmur Thorsteinsson. 2021. Miðeind’s WMT 2021 Submission. In Proceedings of the Sixth Conference on Machine Translation, pages 136–139, Online. Association for Computational Linguistics.
Cite (Informal):
Miðeind’s WMT 2021 Submission (Símonarson et al., WMT 2021)
PDF:
https://preview.aclanthology.org/improve-issue-templates/2021.wmt-1.9.pdf
Data
CCMatrix, IPAC, ParaCrawl