Findings of the WMT 2023 Shared Task on Low-Resource Indic Language Translation

Santanu Pal, Partha Pakray, Sahinur Rahman Laskar, Lenin Laitonjam, Vanlalmuansangi Khenglawt, Sunita Warjri, Pankaj Kundan Dadure, Sandeep Kumar Dash


Abstract
This paper presents the results of the low-resource Indic language translation task organized alongside the Eighth Conference on Machine Translation (WMT) 2023. Participants were asked to build machine translation systems for any of four language pairs: English-Assamese, English-Mizo, English-Khasi, and English-Manipuri. For this task, the IndicNE-Corp1.0 dataset was released; it consists of parallel and monolingual corpora for the northeastern Indic languages Assamese, Mizo, Khasi, and Manipuri. Submissions are evaluated using automatic metrics (BLEU, TER, RIBES, COMET, ChrF) and human evaluation.
Anthology ID:
2023.wmt-1.56
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
682–694
URL:
https://aclanthology.org/2023.wmt-1.56
DOI:
10.18653/v1/2023.wmt-1.56
Cite (ACL):
Santanu Pal, Partha Pakray, Sahinur Rahman Laskar, Lenin Laitonjam, Vanlalmuansangi Khenglawt, Sunita Warjri, Pankaj Kundan Dadure, and Sandeep Kumar Dash. 2023. Findings of the WMT 2023 Shared Task on Low-Resource Indic Language Translation. In Proceedings of the Eighth Conference on Machine Translation, pages 682–694, Singapore. Association for Computational Linguistics.
Cite (Informal):
Findings of the WMT 2023 Shared Task on Low-Resource Indic Language Translation (Pal et al., WMT 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2023.wmt-1.56.pdf