Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese)

Thazin Myint Oo, Ye Kyaw Thu, Khin Mar Soe


Abstract
This work explores neural machine translation between Myanmar (Burmese) and Rakhine (Arakanese). Rakhine is a language closely related to Myanmar and is often considered a dialect. We implemented three prominent neural machine translation (NMT) systems: recurrent neural networks (RNN), transformer, and convolutional neural networks (CNN). The systems were evaluated on a Myanmar-Rakhine parallel text corpus that we developed. In addition, two word segmentation schemes for word embeddings were studied: Word-BPE and Syllable-BPE segmentation. Our experimental results clearly show that the highest-quality NMT and statistical machine translation (SMT) performance is obtained with Syllable-BPE segmentation for both types of translations. Focusing on NMT, we find that the transformer with Word-BPE segmentation outperforms CNN and RNN for both Myanmar-Rakhine and Rakhine-Myanmar translation. However, CNN with Syllable-BPE segmentation obtains a higher score than the RNN and transformer.
Anthology ID: W19-1408
Volume: Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects
Month: June
Year: 2019
Address: Ann Arbor, Michigan
Venue: VarDial
Publisher: Association for Computational Linguistics
Pages: 80–88
URL: https://aclanthology.org/W19-1408
DOI: 10.18653/v1/W19-1408
Cite (ACL): Thazin Myint Oo, Ye Kyaw Thu, and Khin Mar Soe. 2019. Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese). In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, pages 80–88, Ann Arbor, Michigan. Association for Computational Linguistics.
Cite (Informal): Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese) (Myint Oo et al., VarDial 2019)
PDF: https://preview.aclanthology.org/ingestion-script-update/W19-1408.pdf