Arsenii Bulgakov
2025
Yandex Submission to the WMT25 General Machine Translation Task
Nikolay Karpachev | Ekaterina Enikeeva | Dmitry Popov | Arsenii Bulgakov | Daniil Panteleev | Dmitrii Ulianov | Artem Kryukov | Artem Mekhraliev
Proceedings of the Tenth Conference on Machine Translation
This paper describes the Yandex submission to the WMT25 General Machine Translation task. We participate in the English-to-Russian translation direction and propose a purely LLM-based translation model. Our training pipeline consists of several stages built upon YandexGPT, an in-house general-purpose LLM. First, we employ continual pretraining (post-pretrain) on the MT task for initial adaptation to multilinguality and translation. Next, we apply SFT on a parallel document-level corpus in the form of P-Tuning. Following SFT, we propose a novel two-stage alignment scheme: the first stage is curriculum learning with a difficulty schedule, and the second trains the model for tag preservation and error correction, using human post-edits as training samples. Our model achieves results comparable to human reference translations on multiple domains.
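The abstract names curriculum learning with a difficulty schedule but does not say how difficulty is measured or how stages are formed. The sketch below is an illustration only, not the authors' implementation: it assumes a cumulative schedule in which each stage adds harder examples to the easier ones, and the names `curriculum_stages`, `difficulty`, and `num_stages` are hypothetical.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# A parallel training example: (source text, target text).
Pair = Tuple[str, str]


def curriculum_stages(
    pairs: Sequence[Pair],
    difficulty: Callable[[Pair], float],
    num_stages: int,
) -> Iterator[List[Pair]]:
    """Yield training subsets of increasing difficulty.

    Assumption (not from the paper): the schedule is cumulative, so
    stage k contains the easiest ceil(k / num_stages) share of the data.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Rank examples from easiest to hardest under the supplied score.
    ranked = sorted(pairs, key=difficulty)
    for stage in range(1, num_stages + 1):
        # Ceiling division so the final stage always covers all examples.
        cutoff = (len(ranked) * stage + num_stages - 1) // num_stages
        yield ranked[:cutoff]


if __name__ == "__main__":
    # Hypothetical difficulty proxy: number of source tokens.
    data = [("a b c d", "x"), ("a", "x"), ("a b c", "x"), ("a b", "x")]
    for i, subset in enumerate(
        curriculum_stages(data, lambda p: len(p[0].split()), num_stages=2), 1
    ):
        print(i, [src for src, _ in subset])
```

Source length is used here only as a stand-in score. Any scalar estimate of difficulty, such as a quality-estimation score, could be passed as `difficulty` instead.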