AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task

Quentin Fuxa; Dominik Macháček

doi:10.18653/v1/2026.iwslt-1.32

Abstract

We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a synchronous cascade: Qwen3-ASR with forced alignment produces an incrementally updated source transcript, and Gemma-4 E4B-it translates that prefix under an MT-side AlignAtt policy. To our knowledge, this is the first application of AlignAtt to a decoder-only LLM, where the encoder-decoder cross-attention used by earlier AlignAtt systems is absent. We recover a usable policy by proposing (1) an explicit source span in the prompt, (2) offline selection of translation-specific alignment heads, (3) selective qk-fast replay of the draft-to-source attention block, and (4) runtime query/key capture that preserves model outputs bit-identically. On the IWSLT 2026 development set, AlignAtt4LLM outperforms the supplied baselines for the European target languages, English to German and English to Italian, in both the low-latency regime (around 2 seconds CU-LongYAAL) and the high-latency regime (below 4 seconds CU-LongYAAL). Results for English to Chinese are more mixed, but the method is not tied to Gemma-4. Because AlignAtt4LLM only requires a deterministic prompt layout, calibrated attention heads, and query/key capture, the same policy can be reapplied to stronger translation-focused decoder-only MT backbones for non-European target languages.
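To make the policy named in the abstract more concrete, the following is a minimal sketch of the generic AlignAtt stopping rule applied to text source tokens. It is not the authors' implementation: the function names `average_heads` and `alignatt_commit_length`, the `frontier` parameter, and the toy attention values are all hypothetical, and the sketch assumes that attention weights from each draft target token to the tokens of the explicit source span have already been obtained for the selected alignment heads.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def average_heads(per_head: Sequence[Matrix], heads: Sequence[int]) -> Matrix:
    """Average draft-to-source attention over a chosen subset of heads.

    per_head[h][i][j] is the attention weight of head h from draft token i
    to source-span token j. The head subset is assumed to have been chosen
    beforehand (hypothetical input, not computed here).
    """
    rows = len(per_head[heads[0]])
    cols = len(per_head[heads[0]][0])
    return [
        [sum(per_head[h][i][j] for h in heads) / len(heads) for j in range(cols)]
        for i in range(rows)
    ]


def alignatt_commit_length(attn: Matrix, frontier: int) -> int:
    """Return how many leading draft tokens may be emitted.

    A draft token is held back as soon as its most-attended source token
    lies among the last `frontier` tokens of the source prefix, since that
    part of the prefix may still change. All later draft tokens are held
    back as well.
    """
    if not attn:
        return 0
    src_len = len(attn[0])
    cutoff = src_len - frontier  # source positions >= cutoff count as too recent
    for i, row in enumerate(attn):
        most_attended = max(range(src_len), key=row.__getitem__)
        if most_attended >= cutoff:
            return i
    return len(attn)


# Toy example: two heads, four draft tokens, four source tokens.
head_a = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.3, 0.6],
    [0.5, 0.3, 0.1, 0.1],
]
head_b = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.4, 0.4, 0.1, 0.1],
]
attn = average_heads([head_a, head_b], heads=[0, 1])
committed = alignatt_commit_length(attn, frontier=1)
print(committed)
```

In this toy case the third draft token attends most strongly to the final source token, so only the first two draft tokens are committed and the rest wait for more source text. A larger `frontier` makes the policy more conservative, which trades latency for stability.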

Anthology ID: 2026.iwslt-1.32
Volume: Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
Month: July
Year: 2026
Address: San Diego, USA (in-person and online)
Editors: Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
Venues: IWSLT | WS
SIG: SIGSLT
Publisher: Association for Computational Linguistics
Pages: 284–295
URL: https://preview.aclanthology.org/corrections-2026-06/2026.iwslt-1.32/
DOI: 10.18653/v1/2026.iwslt-1.32
Cite (ACL): Quentin Fuxa and Dominik Macháček. 2026. AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 284–295, San Diego, USA (in-person and online). Association for Computational Linguistics.
Cite (Informal): AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task (Fuxa & Macháček, IWSLT 2026)
PDF: https://preview.aclanthology.org/corrections-2026-06/2026.iwslt-1.32.pdf
