Amulya Dash


2023

This paper describes the proposed system for mutlimodal machine translation. We have participated in multimodal translation tasks for English into three Indic languages: Hindi, Bengali, and Malayalam. We leverage the inherent richness of multimodal data to bridge the gap of ambiguity in translation. We fine-tuned the ‘No Language Left Behind’ (NLLB) machine translation model for multimodal translation, further enhancing the model accuracy by image data augmentation using latent diffusion. Our submission achieves the best BLEU score for English-Hindi, English-Bengali, and English-Malayalam language pairs for both Evaluation and Challenge test sets.

2021

This paper describes the team (“Tamalli”)’s submission to AmericasNLP2021 shared task on Open Machine Translation for low resource South American languages. Our goal was to evaluate different Machine Translation (MT) techniques, statistical and neural-based, under several configuration settings. We obtained the second-best results for the language pairs “Spanish-Bribri”, “Spanish-Asháninka”, and “Spanish-Rarámuri” in the category “Development set not used for training”. Our performed experiments will serve as a point of reference for researchers working on MT with low-resource languages.