Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models

Minghan Wang, Thuy-Trang Vu, Yuxia Wang, Ehsan Shareghi, Gholamreza Haffari


Abstract
Simultaneous machine translation (SimulMT) presents a challenging trade-off between translation quality and latency. Recent studies have shown that large language models (LLMs) can achieve good performance in SimulMT tasks. However, this often comes at the expense of high inference cost and latency. In this paper, we propose a conversational SimulMT framework that improves the inference efficiency of LLM-based SimulMT through multi-turn dialogue decoding, in which source and target chunks interleave in the translation history, enabling reuse of the key-value (KV) cache. To adapt LLMs to the proposed conversational decoding, we construct supervised fine-tuning data by segmenting parallel sentences with an alignment tool, and introduce a novel augmentation technique to improve generalization. Our experiments with Llama2-7b-chat on three SimulMT benchmarks demonstrate that the proposed method retains the LLM's superior translation quality while achieving computational latency comparable to that of specialized SimulMT models.
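The sketch below illustrates the core idea of the abstract: each incoming source chunk is appended as a new dialogue turn and the corresponding target chunk is decoded while extending, rather than rebuilding, the KV cache. This is a minimal illustration, not the authors' implementation; the prompt template, chunk boundaries (which the paper derives from alignment-based segmentation), and the stopping criterion are assumptions, and any chat-tuned causal LM served via HuggingFace transformers is assumed to stand in for Llama2-7b-chat.

```python
# Minimal sketch of conversational SimulMT decoding (assumptions noted inline):
# every source chunk arrives as a new "user" turn, the partial translation is
# produced as the "assistant" turn, and the KV cache for all earlier turns is
# reused instead of re-encoding the growing dialogue prefix at each step.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf"  # assumption: any chat LLM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"  # needs accelerate
)
model.eval()

past = None  # KV cache shared across all dialogue turns; grows monotonically

@torch.no_grad()
def translate_chunk(src_chunk: str, max_new_tokens: int = 32) -> str:
    """Feed one source chunk as a new user turn and greedily decode the
    corresponding target chunk, extending (not rebuilding) the KV cache."""
    global past
    # Hypothetical turn template; the paper's actual prompt format may differ.
    turn = f"[INST] {src_chunk} [/INST]"
    ids = tok(turn, return_tensors="pt", add_special_tokens=False).input_ids
    ids = ids.to(model.device)
    out_tokens = []
    for _ in range(max_new_tokens):
        result = model(input_ids=ids, past_key_values=past, use_cache=True)
        past = result.past_key_values
        next_id = result.logits[:, -1].argmax(dim=-1, keepdim=True)
        if next_id.item() == tok.eos_token_id:  # chunk translation finished
            break
        out_tokens.append(next_id.item())
        ids = next_id  # only the new token is fed; earlier turns stay cached
    return tok.decode(out_tokens, skip_special_tokens=True)

# Usage: stream source chunks as they arrive; in the paper, chunk boundaries
# come from alignment-based segmentation rather than being fixed in advance.
for chunk in ["Das ist", "ein Beispiel", "für simultane Übersetzung."]:
    print(translate_chunk(chunk))
```

Because the interleaved turns only ever append to the dialogue, the per-chunk cost is roughly proportional to the new tokens rather than the full history, which is the source of the efficiency gain claimed in the abstract.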
Anthology ID:
2025.iwslt-1.8
Volume:
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
Venues:
IWSLT | WS
Publisher:
Association for Computational Linguistics
Pages:
93–105
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.8/
Cite (ACL):
Minghan Wang, Thuy-Trang Vu, Yuxia Wang, Ehsan Shareghi, and Gholamreza Haffari. 2025. Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 93–105, Vienna, Austria (in-person and online). Association for Computational Linguistics.
Cite (Informal):
Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models (Wang et al., IWSLT 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.8.pdf