Zero-Shot Neural Machine Translation: Russian-Hindi @LoResMT 2020

Sahinur Rahman Laskar, Abdullah Faiz Ur Rahman Khilji, Partha Pakray, Sivaji Bandyopadhyay


Abstract
Neural machine translation (NMT) is a widely accepted approach in the machine translation (MT) community for translating from one natural language to another. Although NMT shows remarkable performance for both high- and low-resource languages, it needs a sufficiently large training corpus. The scarcity of parallel corpora for low-resource language pairs is one of the main challenges in MT. To mitigate this issue, NMT can exploit monolingual corpora to improve translation for low-resource language pairs. The Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020) organized shared tasks on low-resource language pair translation using zero-shot NMT, in which no parallel corpus is used and only monolingual corpora are allowed. We participated in this shared task under the team name CNLP-NITS for the Russian-Hindi language pair. We used masked sequence to sequence pre-training for language generation (MASS) with only monolingual corpora, following the unsupervised NMT architecture. According to the results released for the LoResMT 2020 shared task, our system achieves a bilingual evaluation understudy (BLEU) score of 0.59, a precision of 3.43, a recall of 5.48, an F-measure of 4.22, and a rank-based intuitive bilingual evaluation score (RIBES) of 0.180147 for Russian to Hindi translation. For Hindi to Russian translation, it achieves BLEU, precision, recall, F-measure, and RIBES scores of 1.11, 4.72, 4.41, 4.56, and 0.026842, respectively.
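The reported F-measure values can be checked against the reported precision and recall. The sketch below assumes the F-measure is the balanced harmonic mean of precision and recall (F1); the abstract does not state which variant the shared task used, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (harmonic mean of precision and recall).

    Assumption: the shared task reports F1; the abstract does not say so.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), copied from the abstract.
reported = {
    "Russian-Hindi": (3.43, 5.48, 4.22),
    "Hindi-Russian": (4.72, 4.41, 4.56),
}

for direction, (p, r, f_reported) in reported.items():
    f_computed = round(f_measure(p, r), 2)
    print(f"{direction}: computed {f_computed}, reported {f_reported}")
```

Under that assumption, the computed values agree with the reported ones to two decimal places in both translation directions.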
Anthology ID:
2020.loresmt-1.5
Volume:
Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Alina Karakanta, Atul Kr. Ojha, Chao-Hong Liu, Jade Abbott, John Ortega, Jonathan Washington, Nathaniel Oco, Surafel Melaku Lakew, Tommi A Pirinen, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venue:
LoResMT
Publisher:
Association for Computational Linguistics
Pages:
38–42
URL:
https://aclanthology.org/2020.loresmt-1.5
Cite (ACL):
Sahinur Rahman Laskar, Abdullah Faiz Ur Rahman Khilji, Partha Pakray, and Sivaji Bandyopadhyay. 2020. Zero-Shot Neural Machine Translation: Russian-Hindi @LoResMT 2020. In Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages, pages 38–42, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Zero-Shot Neural Machine Translation: Russian-Hindi @LoResMT 2020 (Laskar et al., LoResMT 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2020.loresmt-1.5.pdf