Srinivasan Umesh


2025

pdf bib
Effectively combining Phi-4 and NLLB for Spoken Language Translation: SPRING Lab IITM’s submission to Low Resource Multilingual Indic Track
Sankalpa Sarkar | Samriddhi Kashyap | Advait Joglekar | Srinivasan Umesh
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)

This paper presents the methodologies implemented for Spoken Language Translation for the language pairs Hindi-English, Bengali-English and Tamil-English for the Low Resource Multilingual Indic Track of The International Conference on Spoken Language Translation (IWSLT) for 2025. We adopt a cascaded approach and use a fine-tuned Phi-4 multimodal instruct model for Automatic Speech Recognition(ASR) and a fine-tuned NLLB model for Machine Translation(MT).

2024

pdf bib
SPRING Lab IITM’s Submission to Low Resource Indic Language Translation Shared Task
Hamees Sayed | Advait Joglekar | Srinivasan Umesh
Proceedings of the Ninth Conference on Machine Translation

We develop a robust translation model for four low-resource Indic languages: Khasi, Mizo, Manipuri, and Assamese. Our approach includes a comprehensive pipeline from data collection and preprocessing to training and evaluation, leveraging data from WMT task datasets, BPCC, PMIndia, and OpenLanguageData. To address the scarcity of bilingual data, we use back-translation techniques on monolingual datasets for Mizo and Khasi, significantly expanding our training corpus. We fine-tune the pre-trained NLLB 3.3B model for Assamese, Mizo, and Manipuri, achieving improved performance over the baseline. For Khasi, which is not supported by the NLLB model, we introduce special tokens and train the model on our Khasi corpus. Our training involves masked language modelling, followed by fine-tuning for English-to-Indic and Indic-to-English translations.