CUNI-NL@IWSLT 2025: End-to-end Offline Speech Translation and Instruction Following with LLMs

Nam Luu, Ondřej Bojar


Abstract
This paper describes the CUNI-NL team’s submission to the IWSLT 2025 Offline Speech Translation and Instruction Following tasks, focusing on transcribing English audio and translating English audio into German text. Our systems follow the end-to-end approach, with each system consisting of a pretrained, frozen speech encoder and a medium-sized large language model fine-tuned with LoRA on three tasks: 1) transcribing the English audio; 2) translating the English audio directly into German text; and 3) a combination of the two, i.e. simultaneously transcribing the English audio and translating it into German text.
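As a rough illustration of the LoRA fine-tuning mentioned in the abstract, the sketch below shows a single linear layer whose pretrained weight stays frozen while a low-rank update is trained on top of it. It is a toy, dependency-free example of the general LoRA idea, not the authors' implementation: the class name `LoRALinear`, the helper `matvec`, and the rank, scaling, and initialisation values are all assumptions chosen for illustration, and the abstract does not specify which layers or hyperparameters the submitted systems use.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A.

    Hypothetical toy layer: only A and B would receive gradient updates.
    """

    def __init__(self, W, r, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pretrained weight, never updated
        # A starts with small random values, B with zeros, so the
        # adapted layer initially behaves exactly like the frozen one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)                   # frozen path: W x
        delta = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + self.scale * d for b, d in zip(base, delta)]

    def num_trainable(self):
        """Count the parameters in A and B (W is excluded because it is frozen)."""
        return sum(len(row) for row in self.A) + sum(len(row) for row in self.B)
```

With a 2×3 frozen weight and rank 1, the adapter adds 1·3 + 2·1 = 5 trainable parameters, and because `B` starts at zero the layer's output matches the frozen layer until training changes `B`.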
Anthology ID:
2025.iwslt-1.28
Volume:
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
Venues:
IWSLT | WS
Publisher:
Association for Computational Linguistics
Pages:
282–288
URL:
https://preview.aclanthology.org/landing_page/2025.iwslt-1.28/
Cite (ACL):
Nam Luu and Ondřej Bojar. 2025. CUNI-NL@IWSLT 2025: End-to-end Offline Speech Translation and Instruction Following with LLMs. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 282–288, Vienna, Austria (in-person and online). Association for Computational Linguistics.
Cite (Informal):
CUNI-NL@IWSLT 2025: End-to-end Offline Speech Translation and Instruction Following with LLMs (Luu & Bojar, IWSLT 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.iwslt-1.28.pdf