Abstract
This paper presents the systems submitted by the Yes-MT team for the Low-Resource Indic Language Translation Shared Task at WMT 2024, focusing on translation between English and Assamese, Mizo, Khasi, and Manipuri. The experiments explored various approaches, including fine-tuning pre-trained models such as mT5 and IndicBart in both multilingual and monolingual settings, LoRA fine-tuning of IndicTrans2, zero-shot and few-shot prompting with large language models (LLMs) such as Llama 3 and Mixtral 8x7B, LoRA supervised fine-tuning of Llama 3, and training Transformers from scratch. The results were evaluated on the WMT23 Low-Resource Indic Language Translation Shared Task's test data using SacreBLEU and chrF; they highlight the challenges of low-resource translation and show the potential of LLMs for these tasks, particularly with fine-tuning.
- Anthology ID:
- 2024.wmt-1.71
- Volume:
- Proceedings of the Ninth Conference on Machine Translation
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
- Venue:
- WMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 788–792
- URL:
- https://aclanthology.org/2024.wmt-1.71
- DOI:
- 10.18653/v1/2024.wmt-1.71
- Cite (ACL):
- Yash Bhaskar and Parameswari Krishnamurthy. 2024. Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024. In Proceedings of the Ninth Conference on Machine Translation, pages 788–792, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024 (Bhaskar & Krishnamurthy, WMT 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.wmt-1.71.pdf
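The abstract notes that results were scored with SacreBLEU and chrF. To illustrate the idea behind the chrF metric (a character n-gram F-score with recall weighted higher than precision), here is a minimal pure-Python sketch; it is a simplification for illustration only, not the official sacrebleu implementation used in the shared task.

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams with whitespace removed (chrF ignores spaces by default).
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified sentence-level chrF: average char n-gram precision and
    recall over n = 1..max_n, combined with an F-beta score (beta = 2
    weights recall twice as much as precision), scaled to 0-100."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

print(round(chrf("the cat sat", "the cat sat"), 1))  # identical strings score 100.0
```

For the official metric, the `sacrebleu` package exposes `corpus_chrf` and `corpus_bleu`, which handle tokenization, word-order n-grams, and corpus-level aggregation that this sketch omits.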