Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study
Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar, Pawan Goyal
Abstract
Neural Machine Translation (NMT) remains a formidable challenge, especially for low-resource languages. Pre-trained sequence-to-sequence (seq2seq) multilingual models such as mBART-50 have demonstrated impressive performance on various low-resource NMT tasks. However, their pre-training covers only 50 languages, leaving out numerous low-resource languages, particularly those spoken in the Indian subcontinent. Expanding mBART-50's language support would require complex further pre-training and risks performance decline through catastrophic forgetting. To address these challenges, this paper explores a framework that combines a pre-trained language model with knowledge distillation in a seq2seq architecture to enable translation for low-resource languages, including those not covered by mBART-50. The proposed framework employs a multilingual encoder-based seq2seq model as the foundational architecture and subsequently applies complementary knowledge distillation techniques to mitigate the impact of imbalanced training. We evaluate the framework on three low-resource Indic languages in four Indic-to-Indic directions, obtaining significant BLEU-4 and chrF improvements over the baselines. We further conduct a human evaluation to confirm the effectiveness of our approach. Our code is publicly available at https://github.com/raypretam/Two-step-low-res-NMT.
- Anthology ID:
- 2024.loresmt-1.7
- Volume:
- Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
- Venues:
- LoResMT | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 64–73
- URL:
- https://aclanthology.org/2024.loresmt-1.7
- Cite (ACL):
- Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar, and Pawan Goyal. 2024. Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 64–73, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study (Roy et al., LoResMT-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.loresmt-1.7.pdf
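The abstract describes combining a seq2seq model with knowledge distillation but does not spell out the loss. As a rough illustration of the general technique (not the paper's specific "complementary" variant, whose details are in the full text), a standard word-level distillation objective mixes cross-entropy on the gold token with a temperature-scaled KL divergence between teacher and student distributions; the function names, temperature `T`, and mixing weight `alpha` below are illustrative assumptions:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over a list of vocabulary logits.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, gold_index, T=2.0, alpha=0.5):
    """Word-level knowledge distillation loss for one target position.

    Mixes cross-entropy against the gold token with KL(teacher || student)
    computed at temperature T; the T**2 factor rescales gradients as in
    standard distillation formulations.
    """
    # Cross-entropy term on the student's (T=1) distribution.
    ce = -math.log(softmax(student_logits)[gold_index])
    # KL term between softened teacher and student distributions.
    q_teacher = softmax(teacher_logits, T)
    q_student = softmax(student_logits, T)
    kl = sum(qt * math.log(qt / qs) for qt, qs in zip(q_teacher, q_student))
    return alpha * ce + (1 - alpha) * (T ** 2) * kl
```

When the student matches the teacher exactly, the KL term vanishes and only the weighted cross-entropy remains; in practice this loss is averaged over all target positions in a batch.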