Love Thy Neighbor: Combining Two Neighboring Low-Resource Languages for Translation
John E. Ortega, Richard Alexander Castro Mamani, Jaime Rafael Montoya Samame
Abstract
Low-resource languages sometimes take on similar morphological and syntactic characteristics due to their geographic nearness and shared history. Two low-resource neighboring languages found in Peru, Quechua and Ashaninka, can be considered, at first glance, two languages that are morphologically similar. In order to translate the two languages, various approaches have been taken. For Quechua, neural machine transfer-learning has been used along with byte-pair encoding. For Ashaninka, the language of the two with fewer resources, a finite-state transducer is used to transform Ashaninka texts and its dialects for machine translation use. We evaluate and compare two approaches by attempting to use newly-formed Ashaninka corpora for neural machine translation. Our experiments show that combining the two neighboring languages, while similar in morphology, word sharing, and geographical location, improves Ashaninka–Spanish translation but degrades Quechua–Spanish translations.
- Anthology ID:
- 2021.mtsummit-loresmt.5
- Volume:
- Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)
- Month:
- August
- Year:
- 2021
- Address:
- Virtual
- Editors:
- John Ortega, Atul Kr. Ojha, Katharina Kann, Chao-Hong Liu
- Venue:
- LoResMT
- Publisher:
- Association for Machine Translation in the Americas
- Pages:
- 44–51
- URL:
- https://aclanthology.org/2021.mtsummit-loresmt.5
- Cite (ACL):
- John E. Ortega, Richard Alexander Castro Mamani, and Jaime Rafael Montoya Samame. 2021. Love Thy Neighbor: Combining Two Neighboring Low-Resource Languages for Translation. In Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021), pages 44–51, Virtual. Association for Machine Translation in the Americas.
- Cite (Informal):
- Love Thy Neighbor: Combining Two Neighboring Low-Resource Languages for Translation (Ortega et al., LoResMT 2021)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2021.mtsummit-loresmt.5.pdf