Sandeep Dash


2024

Findings of WMT 2024 Shared Task on Low-Resource Indic Languages Translation
Partha Pakray | Santanu Pal | Advaitha Vetagiri | Reddi Krishna | Arnab Kumar Maji | Sandeep Dash | Lenin Laitonjam | Lyngdoh Sarah | Riyanka Manna
Proceedings of the Ninth Conference on Machine Translation

This paper presents the results of the low-resource Indic language translation task, organized in conjunction with the Ninth Conference on Machine Translation (WMT) 2024. In this edition, participants were challenged to develop machine translation models for four language pairs: English–Assamese, English–Mizo, English–Khasi, and English–Manipuri. The task used the enriched IndicNE-Corp1.0 dataset, which includes parallel and monolingual corpora for northeastern Indic languages. Systems were evaluated with a suite of automatic metrics (BLEU, TER, RIBES, METEOR, and ChrF), supplemented by human assessment of translation performance and accuracy. The initiative aims to drive advances in low-resource machine translation and to contribute to the growing body of research in this field.