Study on Unsupervised Statistical Machine Translation for Backtranslation
Anush Kumar, Nihal V. Nayak, Aditya Chandra, Mydhili K. Nair
Abstract
Machine Translation systems have drastically improved over the years for several language pairs. Monolingual data is often used to generate synthetic sentences that augment the training data, which has been shown to improve the performance of machine translation models. In our paper, we use an Unsupervised Statistical Machine Translation (USMT) system to generate synthetic sentences. Our study compares the performance improvements in a Neural Machine Translation model when using synthetic sentences from supervised and unsupervised Machine Translation models. Our approach of using USMT for backtranslation shows promise in low-resource conditions and achieves an improvement of 3.2 BLEU over the Neural Machine Translation model.
- Anthology ID:
- R19-1068
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 578–582
- URL:
- https://aclanthology.org/R19-1068
- DOI:
- 10.26615/978-954-452-056-4_068
- Cite (ACL):
- Anush Kumar, Nihal V. Nayak, Aditya Chandra, and Mydhili K. Nair. 2019. Study on Unsupervised Statistical Machine Translation for Backtranslation. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 578–582, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Study on Unsupervised Statistical Machine Translation for Backtranslation (Kumar et al., RANLP 2019)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/R19-1068.pdf