Dmitrii Lichko
2025
Low-resource Buryat-Russian neural machine translation
Dari Baturova | Sarana Abidueva | Dmitrii Lichko | Ivan Bondarenko
Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics
This paper presents a study on the development of a neural machine translation (NMT) system for the Russian-Buryat language pair, focusing on the challenges of low-resource translation. We also present a parallel corpus, constructed by processing existing texts and organizing the translation process, and supplemented by data augmentation techniques to enhance model training. We achieve BLEU scores of 20 and 35 for translation into Buryat and Russian, respectively. Native speakers have evaluated the translations as acceptable. Future directions include expanding and cleaning the dataset, improving model training techniques, and exploring dialectal variations within the Buryat language.
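For readers unfamiliar with the metric, the BLEU scores quoted in the abstract can be understood through the minimal sketch below. It is not the authors' evaluation script: the function names `ngrams` and `corpus_bleu`, the whitespace tokenization, the single reference per sentence, and the absence of smoothing are all assumptions made for illustration, and the example sentences are toy English strings rather than data from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100) with one reference per hypothesis.

    Whitespace tokenization and no smoothing are simplifying assumptions.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes hypotheses shorter than the references.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    # Toy example: the hypothesis drops the last reference token.
    print(corpus_bleu(["a b c d"], ["a b c d e"]))
```

Published scores are usually computed with a standardized tool so that tokenization and smoothing are comparable across papers; the abstract does not say which tool was used, so the sketch only illustrates the general form of the metric.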