Manipuri-English Machine Translation using Comparable Corpus

Lenin Laitonjam, Sanasam Ranbir Singh


Abstract
Unsupervised machine translation (MT) models, which can translate without parallel sentences by using comparable corpora, are becoming a promising approach for developing MT for low-resource languages. However, the majority of studies in unsupervised MT have considered resource-rich language pairs with similar linguistic characteristics. In this paper, we investigate the effectiveness of unsupervised MT models over a Manipuri-English comparable corpus. Manipuri is a low-resource language whose linguistic characteristics differ from those of English. This paper focuses on identifying challenges in building unsupervised MT models over the comparable corpus. Various experimental observations show that developing MT over a comparable corpus using unsupervised methods is feasible. Further, the paper identifies future directions for developing effective MT for the Manipuri-English language pair under unsupervised scenarios.
Anthology ID:
2021.mtsummit-loresmt.8
Volume:
Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)
Month:
August
Year:
2021
Address:
Virtual
Venue:
LoResMT
Publisher:
Association for Machine Translation in the Americas
Pages:
78–88
URL:
https://aclanthology.org/2021.mtsummit-loresmt.8
Cite (ACL):
Lenin Laitonjam and Sanasam Ranbir Singh. 2021. Manipuri-English Machine Translation using Comparable Corpus. In Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021), pages 78–88, Virtual. Association for Machine Translation in the Americas.
Cite (Informal):
Manipuri-English Machine Translation using Comparable Corpus (Laitonjam & Ranbir Singh, LoResMT 2021)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2021.mtsummit-loresmt.8.pdf