RankedCOMET: Elevating a 2022 Baseline to a Top-5 Finish in the WMT 2025 QE Task

Sujal Maharjan, Astha Shrestha


Abstract
This paper presents rankedCOMET, a lightweight per-language-pair calibration applied to the publicly available Unbabel/wmt22-comet-da model that yields a competitive Quality Estimation (QE) system for the WMT 2025 shared task. The approach replaces raw model outputs with per-language average ranks and min–max normalizes those ranks to [0,1], preserving intra-language ordering while producing consistent numeric ranges across language pairs. Applied to 742,740 test segments and submitted to Codabench, this unsupervised post-processing improved the aggregated Pearson correlation on the preliminary snapshot and led to a 5th-place finish. We provide detailed pseudocode, ablations (including a negative ensemble attempt), and a reproducible analysis pipeline reporting Pearson, Spearman, and Kendall correlations with bootstrap confidence intervals.
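
The calibration described in the abstract is compact enough to sketch in code. The following is a minimal Python illustration, assuming NumPy and SciPy; the function names (ranked_comet, bootstrap_ci), the average-rank tie handling, and the 0.5 fallback for constant score lists are our assumptions for exposition, not the paper's published pseudocode.

import numpy as np
from scipy.stats import rankdata, pearsonr

def ranked_comet(scores, lang_pairs):
    """Replace raw QE scores with per-language-pair average ranks,
    min-max normalized to [0, 1]. Preserves ordering within each
    language pair while aligning numeric ranges across pairs."""
    scores = np.asarray(scores, dtype=float)
    lang_pairs = np.asarray(lang_pairs)
    out = np.empty_like(scores)
    for lp in np.unique(lang_pairs):
        idx = np.where(lang_pairs == lp)[0]
        ranks = rankdata(scores[idx], method="average")  # tied scores share their average rank
        lo, hi = ranks.min(), ranks.max()
        # Min-max normalize ranks within the pair; a constant score list
        # (hi == lo) collapses to 0.5 to avoid division by zero.
        out[idx] = (ranks - lo) / (hi - lo) if hi > lo else 0.5
    return out

def bootstrap_ci(pred, gold, stat, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a correlation
    statistic such as Pearson, Spearman, or Kendall."""
    rng = np.random.default_rng(seed)
    pred, gold = np.asarray(pred), np.asarray(gold)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(pred), len(pred))  # resample segments with replacement
        vals.append(stat(pred[idx], gold[idx]))
    return tuple(np.percentile(vals, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

# Example: calibrate, then report Pearson r with a 95% bootstrap CI.
# Swap in scipy.stats.spearmanr or kendalltau for the other correlations.
# calibrated = ranked_comet(raw_scores, lang_pairs)
# r = pearsonr(calibrated, gold)[0]
# r_lo, r_hi = bootstrap_ci(calibrated, gold, lambda p, g: pearsonr(p, g)[0])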
Anthology ID:
2025.wmt-1.74
Volume:
Proceedings of the Tenth Conference on Machine Translation
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
994–998
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.74/
Cite (ACL):
Sujal Maharjan and Astha Shrestha. 2025. RankedCOMET: Elevating a 2022 Baseline to a Top-5 Finish in the WMT 2025 QE Task. In Proceedings of the Tenth Conference on Machine Translation, pages 994–998, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
RankedCOMET: Elevating a 2022 Baseline to a Top-5 Finish in the WMT 2025 QE Task (Maharjan & Shrestha, WMT 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.74.pdf