Abstract
We present our submission to the very low resource supervised machine translation task at WMT20. We use a decoder-only transformer architecture and formulate the translation task as language modeling. To address the low-resource aspect of the problem, we pretrain on a parallel corpus of a similar language pair. We then employ an intermediate back-translation step before fine-tuning. Finally, we present an analysis of the system's performance.

- Anthology ID: 2020.wmt-1.127
- Volume: Proceedings of the Fifth Conference on Machine Translation
- Month: November
- Year: 2020
- Address: Online
- Editors: Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
- Venue: WMT
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 1079–1083
- URL: https://aclanthology.org/2020.wmt-1.127
- Cite (ACL): Tucker Berckmann and Berkan Hiziroglu. 2020. Low-Resource Translation as Language Modeling. In Proceedings of the Fifth Conference on Machine Translation, pages 1079–1083, Online. Association for Computational Linguistics.
- Cite (Informal): Low-Resource Translation as Language Modeling (Berckmann & Hiziroglu, WMT 2020)
- PDF: https://preview.aclanthology.org/revert-3132-ingestion-checklist/2020.wmt-1.127.pdf
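The abstract's core idea, casting translation as language modeling with a decoder-only transformer, can be sketched as follows. A parallel pair (source, target) is concatenated into one token sequence on which the model is trained with the ordinary next-token objective; at inference time the model is conditioned on the source plus a separator and decoded until end-of-sequence. This is a minimal illustrative sketch: the special-token names and the example tokens are assumptions, not the authors' exact setup.

```python
# Hedged sketch of the translation-as-language-modeling formulation.
# Special tokens below are illustrative assumptions.
SEP = "<sep>"  # separates source tokens from target tokens
EOS = "<eos>"  # marks the end of the sequence


def make_training_sequence(src_tokens, tgt_tokens):
    """Concatenate source and target into one sequence; a decoder-only
    LM is then trained on it with the standard next-token loss."""
    return src_tokens + [SEP] + tgt_tokens + [EOS]


def make_inference_prompt(src_tokens):
    """At test time, condition on the source plus separator and decode
    until EOS; the generated continuation is the translation."""
    return src_tokens + [SEP]


# Hypothetical German example pair, tokenized by whitespace.
seq = make_training_sequence(["guten", "tag"], ["good", "day"])
prompt = make_inference_prompt(["guten", "tag"])
```

The separator token is what lets a single language model play the role of a conditional translation model: everything before `<sep>` is context, everything after it is the prediction target.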