Saralin A. Lyngdoh
2025
Findings of WMT 2025 Shared Task on Low-resource Indic Languages Translation
Partha Pakray | Reddi Krishna | Santanu Pal | Advaitha Vetagiri | Sandeep Dash | Arnab Kumar Maji | Saralin A. Lyngdoh | Lenin Laitonjam | Anupam Jamatia | Koj Sambyo | Ajit Das | Riyanka Manna
Proceedings of the Tenth Conference on Machine Translation
This study presents the results of the low-resource Indic language translation task organized in collaboration with the Tenth Conference on Machine Translation (WMT) 2025. In this task, participants were required to build machine translation models for seven language pairs, grouped into two categories. Category 1 covers pairs with a moderate amount of training data: English–Assamese, English–Mizo, English–Khasi, English–Manipuri, and English–Nyishi. Category 2 covers pairs with very limited training data: English–Bodo and English–Kokborok. The task leverages the enriched IndicNE-corp1.0 dataset, which consists of an extensive collection of parallel and monolingual corpora for north-eastern Indic languages. Participant submissions were evaluated using automatic machine translation metrics, including BLEU, TER, ROUGE-L, ChrF, and METEOR. In addition to these metrics, this year's evaluation also includes cosine similarity, which compares sentence-level semantic representations as a further measure of model performance. This work aims to promote innovation and advancements in low-resource Indic language translation.
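As a rough illustration of the cosine similarity measure mentioned in the abstract, the sketch below computes the cosine of the angle between two sentence vectors. The abstract does not say which embedding model the organizers used, so the vectors here are hypothetical placeholders and the function name `cosine_similarity` is an assumption made for this example, not the task's evaluation script.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    The inputs stand in for sentence embeddings; how those embeddings
    are produced is NOT specified in the abstract (assumption).
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # A zero vector has no direction; return 0.0 by convention here.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical embeddings of a system output and a reference translation.
hypothesis_vec = [0.2, 0.7, 0.1]
reference_vec = [0.25, 0.65, 0.05]
score = cosine_similarity(hypothesis_vec, reference_vec)
print(f"cosine similarity: {score:.4f}")
```

A score near 1 means the two vectors point in nearly the same direction. Unlike n-gram metrics such as BLEU or ChrF, a score of this kind does not depend on exact word overlap, provided the embeddings themselves encode meaning.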