Iterative Back Translation Revisited: An Experimental Investigation for Low-resource English Assamese Neural Machine Translation
Mazida Ahmed, Kishore Kashyap, Kuwali Talukdar, Parvez Boruah
Abstract
Back translation has been an effective strategy for leveraging monolingual data on both the source and target sides. Research has opened up several ways to improve the procedure; one of them is iterative back translation, in which the monolingual data is repeatedly translated and used to retrain the model. Despite its success, iterative back translation remains relatively unexplored in low-resource scenarios, particularly for rich Indic languages. This paper presents a comprehensive investigation into the application of iterative back translation to the low-resource English-Assamese language pair. A simplified version of iterative back translation is presented. This study explores various critical aspects associated with back translation, including the balance between original and synthetic data and the refinement of the target (backward) model through retraining on cleaner data. The experimental results demonstrate significant improvements in translation quality. Specifically, the simplified approach to iterative back translation yields a +6.38 BLEU improvement for the English-to-Assamese translation direction and a +4.38 BLEU improvement for the Assamese-to-English translation direction. Further gains are observed when higher-quality, cleaner data is incorporated for model retraining, highlighting the potential of iterative back translation as a valuable tool for enhancing low-resource neural machine translation (NMT).
- Anthology ID:
- 2023.icon-1.17
- Volume:
- Proceedings of the 20th International Conference on Natural Language Processing (ICON)
- Month:
- December
- Year:
- 2023
- Address:
- Goa University, Goa, India
- Editors:
- Jyoti D. Pawar, Sobha Lalitha Devi
- Venue:
- ICON
- SIG:
- SIGLEX
- Publisher:
- NLP Association of India (NLPAI)
- Pages:
- 172–179
- URL:
- https://aclanthology.org/2023.icon-1.17
- Cite (ACL):
- Mazida Ahmed, Kishore Kashyap, Kuwali Talukdar, and Parvez Boruah. 2023. Iterative Back Translation Revisited: An Experimental Investigation for Low-resource English Assamese Neural Machine Translation. In Proceedings of the 20th International Conference on Natural Language Processing (ICON), pages 172–179, Goa University, Goa, India. NLP Association of India (NLPAI).
- Cite (Informal):
- Iterative Back Translation Revisited: An Experimental Investigation for Low-resource English Assamese Neural Machine Translation (Ahmed et al., ICON 2023)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2023.icon-1.17.pdf