Nazareth Amlesom Kifle


2020

Character Alignment in Morphologically Complex Translation Sets for Related Languages
Michael Gasser | Binyam Ephrem Seyoum | Nazareth Amlesom Kifle
Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects

For languages with complex morphology, word-to-word translation is a task with various potential applications, for example, in information retrieval, language instruction, and dictionary creation, as well as in machine translation. In this paper, we confine ourselves to the subtask of character alignment for the particular case of families of related languages with very few resources for most or all members. There are many such families; we focus on the subgroup of Semitic languages spoken in Ethiopia and Eritrea. We begin with an adaptation of the familiar alignment algorithms behind statistical machine translation, modifying them as appropriate for our task. We show how character alignment can reveal morphological, phonological, and orthographic correspondences among related languages.
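As a rough illustration of what character alignment with a statistical machine translation aligner can look like, the sketch below runs the standard IBM Model 1 expectation-maximization procedure over characters instead of words. It is an assumption that this resembles the paper's starting point: the abstract does not say which alignment model is adapted or how it is modified. The function names `train_char_model1` and `align` are hypothetical, the word pairs are made-up symbol strings rather than real language data, and the NULL source token of the full model is left out for brevity.

```python
from collections import defaultdict


def train_char_model1(pairs, iterations=10):
    """Estimate t(target_char | source_char) with IBM Model 1 EM.

    `pairs` is a list of (source_word, target_word) strings that are
    assumed to be translations of each other.
    """
    src_chars = {c for s, _ in pairs for c in s}
    tgt_chars = {c for _, t in pairs for c in t}
    # Start from a uniform distribution over target characters.
    t = {s: {c: 1.0 / len(tgt_chars) for c in tgt_chars} for s in src_chars}

    for _ in range(iterations):
        count = defaultdict(float)  # expected (source, target) co-occurrences
        total = defaultdict(float)  # expected occurrences of each source char
        # E-step: share each target character among the source characters
        # of the same word pair, in proportion to the current probabilities.
        for src, tgt in pairs:
            for tc in tgt:
                norm = sum(t[sc][tc] for sc in src)
                for sc in src:
                    frac = t[sc][tc] / norm
                    count[(sc, tc)] += frac
                    total[sc] += frac
        # M-step: renormalise the expected counts into probabilities.
        for sc in src_chars:
            for tc in tgt_chars:
                t[sc][tc] = count[(sc, tc)] / total[sc]
    return t


def align(src, tgt, t):
    """For each target character, return the index of its best source character."""
    return [max(range(len(src)), key=lambda i: t[src[i]][tc]) for tc in tgt]


# Toy, invented data: "a" tends to correspond to "x", "b" to "y", "c" to "z".
pairs = [("ab", "xy"), ("a", "x"), ("bc", "yz")]
t = train_char_model1(pairs)
alignment = align("ab", "xy", t)
```

On this toy input, `alignment` is `[0, 1]`: the first target character is linked to the first source character and the second to the second. With real translation sets, the learned table `t` is where recurring character correspondences between related languages would become visible.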