NeMo@IWSLT 2026: Cascaded System for Simultaneous Speech Translation
Lilit Grigoryan, Vladimir Bataev, Andrei Andrusenko, Oleksii Hrinchuk, Davit Karamyan, Enas Albasiri, Vitaly Lavrukhin, Nikolay Karpov, Boris Ginsburg
Abstract
This paper describes the NVIDIA NeMo team’s submission to the IWSLT 2026 Simultaneous Speech Translation (SimulST) tracks. We use a cascaded architecture combining a dual-mode Unified ASR Transducer model with a multilingual Large Language Model (LLM). The ASR is trained to deliver stable transcriptions across a wide range of latencies, providing a reliable foundation for high-quality LLM translation. Our submission participates in the English–German, English–Italian, and English–Chinese tasks, in both standard and contextualized settings, as well as the Czech–English standard track, covering both low- and high-latency scenarios. We further analyze how ASR and LLM design choices affect the system’s overall latency and translation quality.
- Anthology ID:
- 2026.iwslt-1.23
- Volume:
- Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, USA (in-person and online)
- Editors:
- Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
- Venues:
- IWSLT | WS
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 204–211
- URL:
- https://preview.aclanthology.org/corrections-2026-06/2026.iwslt-1.23/
- DOI:
- 10.18653/v1/2026.iwslt-1.23
- Cite (ACL):
- Lilit Grigoryan, Vladimir Bataev, Andrei Andrusenko, Oleksii Hrinchuk, Davit Karamyan, Enas Albasiri, Vitaly Lavrukhin, Nikolay Karpov, and Boris Ginsburg. 2026. NeMo@IWSLT 2026: Cascaded System for Simultaneous Speech Translation. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 204–211, San Diego, USA (in-person and online). Association for Computational Linguistics.
- Cite (Informal):
- NeMo@IWSLT 2026: Cascaded System for Simultaneous Speech Translation (Grigoryan et al., IWSLT 2026)
- PDF:
- https://preview.aclanthology.org/corrections-2026-06/2026.iwslt-1.23.pdf