Anush Kumar


2019

Study on Unsupervised Statistical Machine Translation for Backtranslation
Anush Kumar | Nihal V. Nayak | Aditya Chandra | Mydhili K. Nair
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Machine Translation systems have improved drastically over the years for several language pairs. Monolingual data is often used to generate synthetic sentences that augment the training data, which has been shown to improve the performance of machine translation models. In our paper, we use an Unsupervised Statistical Machine Translation (USMT) system to generate synthetic sentences. Our study compares the performance improvements in a Neural Machine Translation model when using synthetic sentences from supervised and unsupervised Machine Translation models. Our approach of using USMT for backtranslation shows promise in low-resource conditions and achieves an improvement of 3.2 BLEU points over the Neural Machine Translation model.