@inproceedings{wu-monz-2025-uva,
    title = "{U}v{A}-{MT} at {WMT}25 Evaluation Task: {LLM} Uncertainty as a Proxy for Translation Quality",
    author = "Wu, Di  and
      Monz, Christof",
    editor = "Haddow, Barry  and
      Kocmi, Tom  and
      Koehn, Philipp  and
      Monz, Christof",
    booktitle = "Proceedings of the Tenth Conference on Machine Translation",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.72/",
    pages = "974--983",
    ISBN = "979-8-89176-341-8",
    abstract = "This year, we focus exclusively on using the uncertainty quantification as a proxy for translation quality. While this has traditionally been regarded as a form of unsupervised quality estimation, such signals have been overlooked in the design of the current metric models{---}we show their value in the context of LLMs. More specifically, in contrast to conventional unsupervised QE methods, we apply recent calibration technology to adjust translation likelihoods to better align with quality signals, and we use the single resulting model to participate in both the general translation and QE tracks at WMT25.Our offline experiments show some advantages: 1) uncertainty signals extracted from LLMs, like Tower or Gemma-3, provide accurate quality predictions; and 2) calibration technology further improves this QE performance, sometimes even surpassing certain metric models that were trained with human annotations, such as CometKiwi. We therefore argue that uncertainty quantification (confidence), especially from LLMs, can serve as a strong and complementary signal for the metric design, particularly when human-annotated data are lacking. However, we also identify limitations, such as its tendency to assign disproportionately higher scores to hypotheses generated by the model itself."
}
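
The core signal the abstract describes, an LLM's confidence in a fixed translation hypothesis, can be illustrated with a minimal sketch: scoring a hypothesis by its length-normalized log-likelihood under a causal LM. The model name, prompt template, and language pair below are illustrative assumptions, not the authors' exact setup, and the paper's calibration step (adjusting these raw likelihoods to better track quality) is not shown here.

```python
# Minimal sketch: length-normalized sequence log-likelihood as an
# unsupervised quality-estimation (QE) signal, in the spirit of
# "LLM uncertainty as a proxy for translation quality".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: any Hugging Face causal LM works here; the paper uses
# models such as Tower and Gemma-3.
MODEL_NAME = "google/gemma-3-1b-it"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def confidence_score(source: str, hypothesis: str) -> float:
    """Average log-probability of the hypothesis tokens given a
    translation prompt; higher means more confident and, ideally,
    a better translation."""
    # Assumed prompt template; the prompt tokens must form a prefix
    # of the full tokenization for the slicing below to be exact.
    prompt = f"Translate to German: {source}\nTranslation: "
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + hypothesis, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position t predicts token t+1, so align logits with shifted targets.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    token_lp = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    # Keep only the log-probs of the hypothesis tokens.
    hyp_lp = token_lp[prompt_ids.shape[1] - 1 :]
    return hyp_lp.mean().item()


print(confidence_score("The cat sat on the mat.",
                       "Die Katze saß auf der Matte."))
```

A score like this can rank competing hypotheses without any human-annotated training data, which is what lets the single model serve both the translation and QE tracks; the self-scoring bias the abstract notes (inflated scores for the model's own outputs) applies directly to this kind of likelihood-based signal.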