RankedCOMET: Elevating a 2022 Baseline to a Top-5 Finish in the WMT 2025 QE Task

Sujal Maharjan, Astha Shrestha


Abstract
This paper presents rankedCOMET, a lightweight per-language-pair calibration applied to the publicly available Unbabel/wmt22-comet-da model that yields a competitive Quality Estimation (QE) system for the WMT 2025 shared task. The approach replaces raw model outputs with per-language average ranks and min–max normalizes those ranks to [0,1], preserving intra-language ordering while producing consistent numeric ranges across language pairs. Applied to 742,740 test segments and submitted to Codabench, this unsupervised post-processing improved the aggregated Pearson correlation on the preliminary snapshot and led to a 5th-place finish. We provide detailed pseudocode, ablations (including a negative ensemble attempt), and a reproducible analysis pipeline reporting Pearson, Spearman, and Kendall correlations with bootstrap confidence intervals.
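
The calibration described in the abstract is compact enough to sketch in code. The following is a minimal Python illustration, assuming NumPy and SciPy; the function names (ranked_comet, bootstrap_ci), the average-rank tie handling, and the 0.5 fallback for constant score lists are our assumptions for exposition, not the paper's published pseudocode.

import numpy as np
from scipy.stats import rankdata, pearsonr

def ranked_comet(scores, lang_pairs):
    """Replace raw QE scores with per-language-pair average ranks,
    min-max normalized to [0, 1]. Preserves ordering within each
    language pair while aligning numeric ranges across pairs."""
    scores = np.asarray(scores, dtype=float)
    lang_pairs = np.asarray(lang_pairs)
    out = np.empty_like(scores)
    for lp in np.unique(lang_pairs):
        idx = np.where(lang_pairs == lp)[0]
        ranks = rankdata(scores[idx], method="average")  # tied scores share their average rank
        lo, hi = ranks.min(), ranks.max()
        # Min-max normalize ranks within the pair; a constant score list
        # (hi == lo) collapses to 0.5 to avoid division by zero.
        out[idx] = (ranks - lo) / (hi - lo) if hi > lo else 0.5
    return out

def bootstrap_ci(pred, gold, stat, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a correlation
    statistic such as Pearson, Spearman, or Kendall."""
    rng = np.random.default_rng(seed)
    pred, gold = np.asarray(pred), np.asarray(gold)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(pred), len(pred))  # resample segments with replacement
        vals.append(stat(pred[idx], gold[idx]))
    return tuple(np.percentile(vals, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

# Example: calibrate, then report Pearson r with a 95% bootstrap CI.
# Swap in scipy.stats.spearmanr or kendalltau for the other correlations.
# calibrated = ranked_comet(raw_scores, lang_pairs)
# r = pearsonr(calibrated, gold)[0]
# r_lo, r_hi = bootstrap_ci(calibrated, gold, lambda p, g: pearsonr(p, g)[0])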
Anthology ID:
2025.wmt-1.74
Volume:
Proceedings of the Tenth Conference on Machine Translation
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
994–998
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.74/
Cite (ACL):
Sujal Maharjan and Astha Shrestha. 2025. RankedCOMET: Elevating a 2022 Baseline to a Top-5 Finish in the WMT 2025 QE Task. In Proceedings of the Tenth Conference on Machine Translation, pages 994–998, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
RankedCOMET: Elevating a 2022 Baseline to a Top-5 Finish in the WMT 2025 QE Task (Maharjan & Shrestha, WMT 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.74.pdf