LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline

Biao Fu, Minpeng Liao, Kai Fan, Chengxi Li, Liang Zhang, Yidong Chen, Xiaodong Shi


Abstract
When the complete source sentence is provided, Large Language Models (LLMs) perform excellently in offline machine translation, even with a simple prompt such as “Translate the following sentence from [src lang] into [tgt lang]:”. In many real scenarios, however, the source tokens arrive in a streaming manner and simultaneous machine translation (SiMT) is required; in this setting, the efficiency and performance of decoder-only LLMs are significantly limited by their auto-regressive nature. To enable LLMs to achieve high-quality SiMT as efficiently as offline translation, we propose a novel paradigm that includes constructing supervised fine-tuning (SFT) data for SiMT, along with new training and inference strategies. To replicate the token input/output stream of SiMT, the source and target tokens are rearranged into an interleaved sequence, separated by special tokens according to varying latency requirements. This enables powerful LLMs to learn read and write operations adaptively, conditioned on varying latency prompts, while still maintaining efficient auto-regressive decoding. Experimental results show that, even with limited SFT data, our approach achieves state-of-the-art performance across various SiMT benchmarks and evaluation metrics, and preserves the original offline translation capabilities. Moreover, our approach generalizes well to the document-level SiMT setting without requiring specific fine-tuning, even surpassing the offline translation model.
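The data-construction idea described in the abstract can be illustrated with a small sketch. The paper's actual special tokens, latency prompts, and read/write policy are not specified on this page, so the separator tokens (<READ>, <WRITE>) and the wait-k style schedule below are hypothetical placeholders; this is a minimal illustration of interleaving source and target tokens into a single SFT sequence, not the authors' implementation.

```python
# Hypothetical sketch of interleaving source and target tokens for SiMT SFT data.
# The separator tokens "<READ>" / "<WRITE>" and the wait-k schedule are
# illustrative assumptions, not the tokens or policy used in the paper.

def build_interleaved_sequence(src_tokens, tgt_tokens, k=3):
    """Arrange tokens so each target token appears only after the source
    tokens available under a wait-k latency schedule have been read."""
    seq, read = [], 0
    for i, tgt in enumerate(tgt_tokens):
        budget = min(i + k, len(src_tokens))
        if read < budget:                      # READ step: consume newly arrived source tokens
            seq.append("<READ>")
            seq.extend(src_tokens[read:budget])
            read = budget
        seq.append("<WRITE>")                  # WRITE step: emit one target token
        seq.append(tgt)
    if read < len(src_tokens):                 # flush any remaining source tokens
        seq.append("<READ>")
        seq.extend(src_tokens[read:])
    return seq

# Example: a wait-3 interleaving of a short German-English pair
src = "Der Hund läuft im Park".split()
tgt = "The dog runs in the park".split()
print(" ".join(build_interleaved_sequence(src, tgt, k=3)))
```

Training on such interleaved sequences keeps decoding purely auto-regressive: at inference time the model simply continues the sequence, deciding whether to read more source or write the next target token.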
Anthology ID:
2025.findings-acl.1045
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
20372–20395
URL:
https://preview.aclanthology.org/landing_page/2025.findings-acl.1045/
Cite (ACL):
Biao Fu, Minpeng Liao, Kai Fan, Chengxi Li, Liang Zhang, Yidong Chen, and Xiaodong Shi. 2025. LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline. In Findings of the Association for Computational Linguistics: ACL 2025, pages 20372–20395, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline (Fu et al., Findings 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.findings-acl.1045.pdf