NRC Systems for the 2020 Inuktitut-English News Translation Task

Rebecca Knowles, Darlene Stewart, Samuel Larkin, Patrick Littell


Abstract
We describe the National Research Council of Canada (NRC) submissions for the 2020 Inuktitut-English shared task on news translation at the Fifth Conference on Machine Translation (WMT20). Our submissions consist of ensembled domain-specific finetuned transformer models, trained using the Nunavut Hansard and news data and, in the case of Inuktitut-English, backtranslated news and parliamentary data. In this work we explore challenges related to the relatively small amount of parallel data, morphological complexity, and domain shifts.
Anthology ID: 2020.wmt-1.13
Volume: Proceedings of the Fifth Conference on Machine Translation
Month: November
Year: 2020
Address: Online
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 156–170
URL: https://aclanthology.org/2020.wmt-1.13
Cite (ACL): Rebecca Knowles, Darlene Stewart, Samuel Larkin, and Patrick Littell. 2020. NRC Systems for the 2020 Inuktitut-English News Translation Task. In Proceedings of the Fifth Conference on Machine Translation, pages 156–170, Online. Association for Computational Linguistics.
Cite (Informal): NRC Systems for the 2020 Inuktitut-English News Translation Task (Knowles et al., WMT 2020)
PDF: https://preview.aclanthology.org/ingestion-script-update/2020.wmt-1.13.pdf
Video: https://slideslive.com/38939639