Abstract
We describe our neural machine translation systems for the 2021 shared task on Unsupervised and Very Low Resource Supervised MT, translating between Upper Sorbian and German (low-resource) and between Lower Sorbian and German (unsupervised). The systems incorporated data filtering, backtranslation, BPE-dropout, ensembling, and transfer learning from high(er)-resource languages. As measured by automatic metrics, our systems showed strong performance, consistently placing first or tied for first across most metrics and translation directions.
- Anthology ID:
- 2021.wmt-1.107
- Volume:
- Proceedings of the Sixth Conference on Machine Translation
- Month:
- November
- Year:
- 2021
- Address:
- Online
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 999–1008
- URL:
- https://aclanthology.org/2021.wmt-1.107
- Cite (ACL):
- Rebecca Knowles and Samuel Larkin. 2021. NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021. In Proceedings of the Sixth Conference on Machine Translation, pages 999–1008, Online. Association for Computational Linguistics.
- Cite (Informal):
- NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021 (Knowles & Larkin, WMT 2021)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2021.wmt-1.107.pdf