Quality-Aware Decoding: Unifying Quality Estimation and Decoding

Sai Koneru, Matthias Huck, Miriam Exel, Jan Niehues


Abstract
Quality Estimation (QE) models for Neural Machine Translation (NMT) predict the quality of a hypothesis without access to a reference. An emerging research direction in NMT uses QE models, which correlate highly with human judgment, to improve translations through Quality-Aware Decoding. Although several approaches have been proposed that sample multiple candidate translations and pick the best one, none integrate the QE model directly into the decoding process. In this paper, we address this gap by proposing a novel token-level QE model that can reliably score partial translations. We build a uni-directional QE model for this purpose, since decoder-only models are trained on, and efficient at scoring, partial sequences. We then present a decoding strategy that integrates the QE model for Quality-Aware Decoding and demonstrate that translation quality improves over N-best list re-ranking with state-of-the-art QE models (by up to 1.39 XCOMET-XXL points). Finally, we show that our approach provides significant benefits on document translation tasks, where the quality of N-best lists is typically suboptimal. Code can be found at https://github.com/SAP-samples/quality-aware-decoding-translation.
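To illustrate the distinction the abstract draws, the following is a minimal toy sketch, not the paper's method: `qe_score` is a hypothetical stand-in for a token-level QE model that can score partial translations (the paper trains a uni-directional decoder model for this), and the interpolation weight `alpha`, the toy `step_fn`, and all token values are invented for illustration. The contrast shown is that re-ranking scores only finished N-best candidates, while quality-aware decoding lets the QE score influence which partial hypotheses survive beam pruning.

```python
def qe_score(prefix):
    """Toy stand-in for a token-level QE model scoring a partial hypothesis.

    Here it simply rewards hypotheses containing the (invented) token "gut";
    a real system would query a trained uni-directional QE model.
    """
    return 1.0 if "gut" in prefix else 0.0


def rerank_nbest(candidates, lm_scores, alpha=0.5):
    """Baseline: combine model score and QE score over *finished* candidates only."""
    combined = [lm + alpha * qe_score(c) for c, lm in zip(candidates, lm_scores)]
    return candidates[max(range(len(candidates)), key=combined.__getitem__)]


def quality_aware_beam(step_fn, beam_size=2, max_len=3, alpha=0.5):
    """Sketch of QE-integrated decoding: rank *partial* hypotheses at every step.

    step_fn(prefix) -> list of (token, token_log_prob) expansions.
    """
    beams = [([], 0.0)]  # (token list, accumulated log-prob)
    for _ in range(max_len):
        expanded = []
        for prefix, lp in beams:
            for tok, tok_lp in step_fn(prefix):
                expanded.append((prefix + [tok], lp + tok_lp))
        # Key difference from re-ranking: the QE score participates in
        # pruning, so a low-probability but high-quality path can survive.
        expanded.sort(key=lambda b: b[1] + alpha * qe_score(b[0]), reverse=True)
        beams = expanded[:beam_size]
    return beams[0][0]
```

With `alpha=0` the beam search reduces to pure log-probability decoding; with a large enough `alpha`, a hypothesis that plain beam search would prune at an early step can be kept because the partial-hypothesis QE score compensates for its lower model probability, which N-best re-ranking cannot do if that hypothesis never reaches the final list.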
Anthology ID:
2025.iwslt-1.3
Volume:
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
Venues:
IWSLT | WS
Publisher:
Association for Computational Linguistics
Pages:
33–46
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.3/
Cite (ACL):
Sai Koneru, Matthias Huck, Miriam Exel, and Jan Niehues. 2025. Quality-Aware Decoding: Unifying Quality Estimation and Decoding. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 33–46, Vienna, Austria (in-person and online). Association for Computational Linguistics.
Cite (Informal):
Quality-Aware Decoding: Unifying Quality Estimation and Decoding (Koneru et al., IWSLT 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.3.pdf