Dmitrii Ulianov


2025

Yandex Submission to the WMT25 General Machine Translation Task
Nikolay Karpachev | Ekaterina Enikeeva | Dmitry Popov | Arsenii Bulgakov | Daniil Panteleev | Dmitrii Ulianov | Artem Kryukov | Artem Mekhraliev
Proceedings of the Tenth Conference on Machine Translation

This paper describes the Yandex submission to the WMT25 General Machine Translation task. We participate in the English-to-Russian translation direction and propose a purely LLM-based translation model. Our training pipeline consists of several stages built upon YandexGPT, an in-house general-purpose LLM. First, we employ continual pretraining (post-pretrain) on the MT task for initial adaptation to multilinguality and translation. Next, we apply SFT on a parallel document-level corpus in the form of P-Tuning. Following SFT, we propose a novel two-stage alignment scheme: the first stage is curriculum learning with a difficulty schedule, and the second trains the model for tag preservation and error correction, using human post-edits as training samples. Our model achieves results comparable to human reference translations on multiple domains.