Bridging Philippine Languages With Multilingual Neural Machine Translation

Renz Iver Baliber, Charibeth Cheng, Kristine Mae Adlaon, Virgion Mamonong


Abstract
The Philippines is home to more than 150 languages, and even its major languages are considered low-resourced. As a result, little effort has gone into developing translation systems for the underrepresented languages. Multilingual neural machine translation (NMT) has become an active area of research because it simplifies building translation systems for multiple languages and helps improve translation quality for zero- to low-resource languages. However, existing work in multilingual NMT has not analyzed multilingual models on a closely related, low-resource language group in the context of pivot-based and zero-shot translation. In this paper, we benchmark translation for several Philippine languages and analyze a multilingual NMT system for morphologically rich, low-resource languages in terms of its effectiveness at translating zero-resource languages through zero-shot translation. To further evaluate the multilingual NMT model's ability to translate language pairs unseen in training, we test it on translation between Tagalog and Cebuano and compare its performance with a simple NMT model trained directly on parallel Tagalog-Cebuano data. We show that zero-shot translation outperforms the directly trained model in some instances, while translating through English as a pivot language outperforms both approaches.
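The setups compared in the abstract (direct, zero-shot, and pivot translation) differ mainly in how a sentence is routed through models. The sketch below is illustrative only and is not the authors' implementation: the lookup-table translators, the function names, and the `<2ceb>` target-language token (a convention used by some multilingual NMT systems, assumed here) are hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict

# A translator maps a source sentence to a target sentence.
Translator = Callable[[str], str]


def make_toy_translator(table: Dict[str, str]) -> Translator:
    """Word-by-word lookup standing in for a trained NMT model (hypothetical)."""
    def translate(sentence: str) -> str:
        # Unknown words are copied through unchanged.
        return " ".join(table.get(word, word) for word in sentence.split())
    return translate


def add_target_token(sentence: str, target_lang: str) -> str:
    """Prefix the input with a target-language token (assumed convention).

    In a single multilingual model, this is one way to request a language
    pair that never appeared in training (zero-shot translation).
    """
    return f"<2{target_lang}> {sentence}"


def pivot_translate(sentence: str,
                    src_to_pivot: Translator,
                    pivot_to_tgt: Translator) -> str:
    """Pivot translation: source -> pivot language -> target."""
    return pivot_to_tgt(src_to_pivot(sentence))


# Toy Tagalog -> English and English -> Cebuano "models".
tl_to_en = make_toy_translator({"aso": "dog", "bahay": "house"})
en_to_ceb = make_toy_translator({"dog": "iro", "house": "balay"})

print(add_target_token("aso bahay", "ceb"))
print(pivot_translate("aso bahay", tl_to_en, en_to_ceb))
```

With real models, each toy translator would be replaced by a call to a trained system; pivoting costs two decoding passes, whereas a zero-shot request uses a single pass through the multilingual model.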
Anthology ID:
2020.loresmt-1.2
Volume:
Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Alina Karakanta, Atul Kr. Ojha, Chao-Hong Liu, Jade Abbott, John Ortega, Jonathan Washington, Nathaniel Oco, Surafel Melaku Lakew, Tommi A Pirinen, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venue:
LoResMT
Publisher:
Association for Computational Linguistics
Pages:
14–22
URL:
https://aclanthology.org/2020.loresmt-1.2
Cite (ACL):
Renz Iver Baliber, Charibeth Cheng, Kristine Mae Adlaon, and Virgion Mamonong. 2020. Bridging Philippine Languages With Multilingual Neural Machine Translation. In Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages, pages 14–22, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Bridging Philippine Languages With Multilingual Neural Machine Translation (Baliber et al., LoResMT 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.loresmt-1.2.pdf