Abstract
We describe our neural machine translation systems for the 2021 shared task on Unsupervised and Very Low Resource Supervised MT, translating between Upper Sorbian and German (low-resource) and between Lower Sorbian and German (unsupervised). The systems incorporated data filtering, backtranslation, BPE-dropout, ensembling, and transfer learning from high(er)-resource languages. As measured by automatic metrics, our systems showed strong performance, consistently placing first or tied for first across most metrics and translation directions.
- Anthology ID:
- 2021.wmt-1.107
- Volume:
- Proceedings of the Sixth Conference on Machine Translation
- Month:
- November
- Year:
- 2021
- Address:
- Online
- Editors:
- Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 999–1008
- URL:
- https://aclanthology.org/2021.wmt-1.107
- Cite (ACL):
- Rebecca Knowles and Samuel Larkin. 2021. NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021. In Proceedings of the Sixth Conference on Machine Translation, pages 999–1008, Online. Association for Computational Linguistics.
- Cite (Informal):
- NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021 (Knowles & Larkin, WMT 2021)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2021.wmt-1.107.pdf