Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding

Wafaa Mohammed, Vlad Niculae, Chrysoula Zerva


Abstract
Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they still struggle to adequately handle discourse phenomena, such as pronoun resolution and lexical cohesion, at the document level. In this study, we thoroughly investigate the performance of LLMs on discourse phenomena in context-aware translation. We demonstrate that discourse knowledge is encoded within LLMs and propose the use of quality-aware decoding (QAD), specifically minimum Bayes risk (MBR) decoding, to effectively extract this knowledge, showcasing its superiority over other decoding approaches through comprehensive analysis. Furthermore, we illustrate that QAD enhances the semantic richness of translations and aligns them more closely with human preferences.
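For readers unfamiliar with the decoding strategy the abstract names: minimum Bayes risk decoding samples a pool of candidate translations and selects the one with the highest expected utility, scoring each candidate against the others as pseudo-references. The sketch below is a minimal, generic illustration only; the toy `unigram_f1` utility is an assumption for self-containment (the paper's actual utility metrics, sampling setup, and candidate counts are not reproduced here).

```python
def unigram_f1(hyp: str, ref: str) -> float:
    """Toy lexical-overlap utility (a stand-in for a learned quality metric)."""
    h, r = set(hyp.split()), set(ref.split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)

def mbr_select(candidates: list[str], utility=unigram_f1) -> str:
    """MBR decoding over a candidate pool: return the candidate with the
    highest average utility against all other candidates (pseudo-references)."""
    def expected_utility(hyp: str) -> float:
        others = [ref for ref in candidates if ref is not hyp]
        return sum(utility(hyp, ref) for ref in others) / max(len(others), 1)
    return max(candidates, key=expected_utility)
```

In practice the candidate pool would be sampled from the LLM, and the utility would be a translation-quality metric rather than unigram overlap; the selection rule itself is unchanged.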
Anthology ID:
2026.eacl-long.220
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Màrquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
4752–4774
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.220/
Cite (ACL):
Wafaa Mohammed, Vlad Niculae, and Chrysoula Zerva. 2026. Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4752–4774, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding (Mohammed et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.220.pdf