Abstract
This paper describes the development of the NMT systems submitted by the AIST AIRC team to the WMT 2023 General Translation task. We trained constrained-track models for translation between English, German, and Japanese. Before training the final models, we first filtered the parallel and monolingual data, then performed iterative back-translation as well as parallel data distillation for non-autoregressive model training. We experimented with training Transformer models, Mega models, and custom non-autoregressive sequence-to-sequence models whose encoder and decoder weights were initialised from a multilingual BERT base model. Our primary submissions contain translations from ensembles of two Mega model checkpoints, and our contrastive submissions are generated by our non-autoregressive models.

- Anthology ID: 2023.wmt-1.13
- Volume: Proceedings of the Eighth Conference on Machine Translation
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
- Venue: WMT
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 155–161
- URL: https://aclanthology.org/2023.wmt-1.13
- DOI: 10.18653/v1/2023.wmt-1.13
- Cite (ACL): Matiss Rikters and Makoto Miwa. 2023. AIST AIRC Submissions to the WMT23 Shared Task. In Proceedings of the Eighth Conference on Machine Translation, pages 155–161, Singapore. Association for Computational Linguistics.
- Cite (Informal): AIST AIRC Submissions to the WMT23 Shared Task (Rikters & Miwa, WMT 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-5/2023.wmt-1.13.pdf