Abstract
This work explores neural machine translation between Myanmar (Burmese) and Rakhine (Arakanese). Rakhine is closely related to Myanmar and is often considered a dialect of it. We implemented three prominent neural machine translation (NMT) systems: recurrent neural networks (RNN), transformer, and convolutional neural networks (CNN). The systems were evaluated on a Myanmar-Rakhine parallel text corpus that we developed. In addition, two word segmentation schemes for word embeddings were studied: Word-BPE and Syllable-BPE segmentation. Our experimental results clearly show that the highest-quality NMT and statistical machine translation (SMT) performance is obtained with Syllable-BPE segmentation for both types of translation. Focusing on NMT, we find that the transformer with Word-BPE segmentation outperforms CNN and RNN for both Myanmar-Rakhine and Rakhine-Myanmar translation. However, CNN with Syllable-BPE segmentation obtains a higher score than the RNN and transformer.
- Anthology ID:
- W19-1408
- Volume:
- Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects
- Month:
- June
- Year:
- 2019
- Address:
- Ann Arbor, Michigan
- Editors:
- Marcos Zampieri, Preslav Nakov, Shervin Malmasi, Nikola Ljubešić, Jörg Tiedemann, Ahmed Ali
- Venue:
- VarDial
- Publisher:
- Association for Computational Linguistics
- Pages:
- 80–88
- URL:
- https://aclanthology.org/W19-1408
- DOI:
- 10.18653/v1/W19-1408
- Cite (ACL):
- Thazin Myint Oo, Ye Kyaw Thu, and Khin Mar Soe. 2019. Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese). In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, pages 80–88, Ann Arbor, Michigan. Association for Computational Linguistics.
- Cite (Informal):
- Neural Machine Translation between Myanmar (Burmese) and Rakhine (Arakanese) (Myint Oo et al., VarDial 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/W19-1408.pdf