Findings of the WMT 2025 Shared Task of the Open Language Data Initiative

David Dale, Laurie Burchell, Jean Maillard, Idris Abdulmumin, Antonios Anastasopoulos, Isaac Caswell, Philipp Koehn


Abstract
We present the results of the WMT 2025 shared task of the Open Language Data Initiative. Participants were invited to contribute to the massively multilingual open datasets (FLORES+, MT Seed, WMT24++) or to create new resources of this kind. We accepted eight submissions: seven extensions or revisions of the existing datasets and one new parallel training dataset, SMOL.
Anthology ID:
2025.wmt-1.26
Volume:
Proceedings of the Tenth Conference on Machine Translation
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
495–502
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.26/
Cite (ACL):
David Dale, Laurie Burchell, Jean Maillard, Idris Abdulmumin, Antonios Anastasopoulos, Isaac Caswell, and Philipp Koehn. 2025. Findings of the WMT 2025 Shared Task of the Open Language Data Initiative. In Proceedings of the Tenth Conference on Machine Translation, pages 495–502, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Findings of the WMT 2025 Shared Task of the Open Language Data Initiative (Dale et al., WMT 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.26.pdf