Effectively combining Phi-4 and NLLB for Spoken Language Translation: SPRING Lab IITM’s submission to Low Resource Multilingual Indic Track

Sankalpa Sarkar, Samriddhi Kashyap, Advait Joglekar, Srinivasan Umesh


Abstract
This paper presents the methodologies implemented for Spoken Language Translation for the language pairs Hindi-English, Bengali-English and Tamil-English for the Low Resource Multilingual Indic Track of the International Conference on Spoken Language Translation (IWSLT) 2025. We adopt a cascaded approach and use a fine-tuned Phi-4 multimodal instruct model for Automatic Speech Recognition (ASR) and a fine-tuned NLLB model for Machine Translation (MT).
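The cascaded design described in the abstract can be illustrated with a minimal sketch: an ASR stage produces a transcript, and an MT stage translates that transcript. Everything named below (`CascadedSLT`, `stub_asr`, `stub_mt`, the language codes) is hypothetical and chosen for illustration only; the stubs are toy stand-ins, not the fine-tuned Phi-4 or NLLB models, and the paper's actual interfaces may differ.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stage signatures (assumptions, not the paper's API):
#   ASR: (audio bytes, source language) -> transcript
#   MT:  (text, source language, target language) -> translation
ASRFn = Callable[[bytes, str], str]
MTFn = Callable[[str, str, str], str]


@dataclass
class CascadedSLT:
    """Chains an ASR stage and an MT stage; the stages share only text."""
    asr: ASRFn
    mt: MTFn

    def translate(self, audio: bytes, source_lang: str, target_lang: str) -> Tuple[str, str]:
        # Stage 1: speech -> text in the source language.
        transcript = self.asr(audio, source_lang)
        # Stage 2: source-language text -> target-language text.
        translation = self.mt(transcript, source_lang, target_lang)
        return transcript, translation


# Toy stand-ins so the sketch runs without any model weights.
def stub_asr(audio: bytes, source_lang: str) -> str:
    return "namaste duniya"


def stub_mt(text: str, source_lang: str, target_lang: str) -> str:
    toy_table = {("namaste duniya", "hi", "en"): "hello world"}
    # Fall back to the untranslated text when the toy table has no entry.
    return toy_table.get((text, source_lang, target_lang), text)


pipeline = CascadedSLT(asr=stub_asr, mt=stub_mt)
transcript, translation = pipeline.translate(b"", "hi", "en")
```

Because the two stages communicate only through the transcript, either one could in principle be fine-tuned or swapped independently, which is the usual motivation for a cascade; the trade-off is that recognition errors propagate into the translation.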
Anthology ID:
2025.iwslt-1.42
Volume:
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
Venues:
IWSLT | WS
Publisher:
Association for Computational Linguistics
Pages:
399–404
URL:
https://preview.aclanthology.org/landing_page/2025.iwslt-1.42/
Cite (ACL):
Sankalpa Sarkar, Samriddhi Kashyap, Advait Joglekar, and Srinivasan Umesh. 2025. Effectively combining Phi-4 and NLLB for Spoken Language Translation: SPRING Lab IITM’s submission to Low Resource Multilingual Indic Track. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 399–404, Vienna, Austria (in-person and online). Association for Computational Linguistics.
Cite (Informal):
Effectively combining Phi-4 and NLLB for Spoken Language Translation: SPRING Lab IITM’s submission to Low Resource Multilingual Indic Track (Sarkar et al., IWSLT 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.iwslt-1.42.pdf