Domain Adaptation for Hindi-Telugu Machine Translation Using Domain Specific Back Translation

Hema Ala, Vandan Mujadia, Dipti Sharma


Abstract
In this paper, we present a novel approach for domain adaptation in Neural Machine Translation which aims to improve the translation quality over a new domain. Adapting to new domains is a highly challenging task for Neural Machine Translation with limited data; it becomes even more difficult for technical domains such as Chemistry and Artificial Intelligence because of their specific terminology. We propose a Domain Specific Back Translation method, which uses available monolingual data and generates synthetic data in a different way, making use of Out Of Domain words. The approach is generic and can be applied to any language pair for any domain. We conduct our experiments on the Chemistry and Artificial Intelligence domains for Hindi and Telugu in both directions. We observe that using the synthetic data created by the proposed algorithm improves the BLEU scores significantly.
Anthology ID:
2021.ranlp-1.4
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
26–34
URL:
https://aclanthology.org/2021.ranlp-1.4
Cite (ACL):
Hema Ala, Vandan Mujadia, and Dipti Sharma. 2021. Domain Adaptation for Hindi-Telugu Machine Translation Using Domain Specific Back Translation. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 26–34, Held Online. INCOMA Ltd.
Cite (Informal):
Domain Adaptation for Hindi-Telugu Machine Translation Using Domain Specific Back Translation (Ala et al., RANLP 2021)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2021.ranlp-1.4.pdf