JU-CSE-NLP’s Cascaded Speech to Text Translation Systems for IWSLT 2025 in Indic Track

Debjit Dhar, Soham Lahiri, Tapabrata Mondal, Sivaji Bandyopadhyay


Abstract
This paper presents the submission of the Jadavpur University Computer Science and Engineering Natural Language Processing (JU-CSENLP) Laboratory to the International Conference on Spoken Language Translation (IWSLT) 2025 Indic track, addressing the speech-to-text translation task in both English-to-Indic (Bengali, Hindi, Tamil) and Indic-to-English directions. To tackle the challenges posed by low resource Indian languages, we adopt a cascaded approach leveraging state-of-the-art pre-trained models. For English-to-Indic translation, we utilize OpenAI’s Whisper model for Automatic Speech Recognition (ASR), followed by the Meta’s No Language Left Behind (NLLB)-200-distilled-600M model finetuned for Machine Translation (MT). For the reverse direction, we employ the AI4Bharat’s IndicConformer model for ASR and IndicTrans2 finetuned for MT. Our models are fine-tuned on the provided benchmark dataset to better handle the linguistic diversity and domain-specific variations inherent in the data. Evaluation results demonstrate that our cascaded systems achieve competitive performance, with notable BLEU and chrF++ scores across all language pairs. Our findings highlight the effectiveness of combining robust ASR and MT components in a cascaded pipeline, particularly for low-resource and morphologically rich Indian languages.
Anthology ID:
2025.iwslt-1.18
Volume:
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
Venues:
IWSLT | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
201–205
Language:
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.18/
DOI:
Bibkey:
Cite (ACL):
Debjit Dhar, Soham Lahiri, Tapabrata Mondal, and Sivaji Bandyopadhyay. 2025. JU-CSE-NLP’s Cascaded Speech to Text Translation Systems for IWSLT 2025 in Indic Track. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 201–205, Vienna, Austria (in-person and online). Association for Computational Linguistics.
Cite (Informal):
JU-CSE-NLP’s Cascaded Speech to Text Translation Systems for IWSLT 2025 in Indic Track (Dhar et al., IWSLT 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.18.pdf