Abstract
Large pre-trained language models have brought remarkable progress in NLP. Pre-training and fine-tuning have given state-of-the-art performance across tasks in text processing. Data augmentation techniques have also helped build state-of-the-art models on low- or zero-resource tasks. Many prior works have attempted to learn a single massively multilingual machine translation model for zero-shot translation. Although these models can produce correct translations, their main challenge is that they often generate output in the wrong language in the zero-shot setting. This work and its results indicate that prompt-conditioned large models do not suffer from such off-target language errors, i.e., errors arising from translating into the wrong language. We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multilingual machine translation.
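As an illustration of the prompt conditioning the abstract refers to, the sketch below shows one way to steer a multilingual sequence-to-sequence model toward a target language with an explicit prompt. The checkpoint name (`google/mt5-base`) and the prompt template are illustrative assumptions, not the paper's actual setup; an untuned checkpoint would typically need fine-tuning before it follows such prompts reliably.

```python
# Minimal sketch of prompt-conditioned zero-shot translation.
# The model name and prompt template below are assumptions for
# illustration, not the method described in the paper.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/mt5-base"  # hypothetical multilingual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def zero_shot_translate(text: str, src_lang: str, tgt_lang: str) -> str:
    # The prompt names the target language explicitly; the abstract's
    # claim is that this kind of conditioning helps the model avoid
    # producing output in the wrong (off-target) language.
    prompt = f"translate {src_lang} to {tgt_lang}: {text}"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(zero_shot_translate("How are you?", "English", "German"))
```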
- Anthology ID: 2022.nlp4dh-1.8
- Volume: Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities
- Month: November
- Year: 2022
- Address: Taipei, Taiwan
- Editors: Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter
- Venue: NLP4DH
- Publisher: Association for Computational Linguistics
- Pages: 53–58
- URL: https://aclanthology.org/2022.nlp4dh-1.8
- Cite (ACL): Kshitij Gupta. 2022. MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation. In Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities, pages 53–58, Taipei, Taiwan. Association for Computational Linguistics.
- Cite (Informal): MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation (Gupta, NLP4DH 2022)
- PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2022.nlp4dh-1.8.pdf