What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation

Shaomu Tan; Dawei Zhu; Ke M. Tran; Michael Denkowski; Sony Trenous; Leonardo F. R. Ribeiro; Bill Byrne; Felix Hieber

Abstract

Iterative refinement is a simple inference-time strategy for machine translation: given an initial translation, an LLM revises it without additional training. Yet document-scale refinement remains poorly understood: 1) which pipelines work best, 2) what quality dimensions improve, and 3) how refiners behave. In this paper, we present a systematic study of document-level literary translation, covering six LLMs and seven language pairs. Across nine translation-refinement granularity combinations and five refinement strategies, a) we find a robust recipe: document-level MT followed by segment-level refinement yields the strongest and most stable improvements. In our setting, doc-level refinement often makes fewer edits and leads to smaller or less reliable gains. Surprisingly, a simple general refinement prompt consistently outperforms error-specific prompting and evaluate-then-refine schemes. b) Fine-grained MQM analyses and professional-translator evaluation show that gains come primarily from fluency, with limited improvements in adequacy. c) Probing translator-refiner strength interactions suggests refinement behaves less like targeted post-editing and more like projecting outputs toward the refiner’s learned distribution while remaining anchored to the initial translation.

Anthology ID: 2026.acl-long.268
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 5929–5957
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.268/
Cite (ACL): Shaomu Tan, Dawei Zhu, Ke Tran, Michael Denkowski, Sony Trenous, Leonardo F. R. Ribeiro, Bill Byrne, and Felix Hieber. 2026. What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5929–5957, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation (Tan et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.268.pdf
Checklist: 2026.acl-long.268.checklist.pdf
