Jiaqiang Wang
2022
The VolcTrans System for WMT22 Multilingual Machine Translation Task
Xian Qian | Kai Hu | Jiaqiang Wang | Yifeng Liu | Xingyuan Pan | Jun Cao | Mingxuan Wang
Proceedings of the Seventh Conference on Machine Translation (WMT)
This report describes our VolcTrans system for the WMT22 shared task on large-scale multilingual machine translation. We participated in the unconstrained track, which allows the use of external resources. Our system is a Transformer-based multilingual model trained on data from multiple sources, including the public training set from the data track, NLLB data provided by Meta AI, self-collected parallel corpora, and pseudo bitext from back-translation. Both bilingual and monolingual texts are cleaned by a series of heuristic rules. On the official test set, our system achieves 17.3 BLEU, 21.9 spBLEU, and 41.9 chrF2++ on average over all language pairs. The average inference speed is 11.5 sentences per second on a single Nvidia Tesla V100 GPU.
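The abstract does not list the heuristic cleaning rules, so the following is only a minimal sketch of what rule-based bitext cleaning commonly looks like, not the VolcTrans pipeline. The function names (`keep_pair`, `clean_bitext`), the rules themselves, and the thresholds are all hypothetical choices made for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Illustrative thresholds; these are assumptions, not values from the report.
MAX_TOKENS = 200        # drop very long segments
MAX_LENGTH_RATIO = 3.0  # drop pairs whose lengths differ too much


def keep_pair(src: str, tgt: str) -> bool:
    """Return True if a sentence pair passes the hypothetical rules."""
    src_tokens, tgt_tokens = src.split(), tgt.split()
    # Rule 1: neither side may be empty.
    if not src_tokens or not tgt_tokens:
        return False
    # Rule 2: neither side may be excessively long.
    if len(src_tokens) > MAX_TOKENS or len(tgt_tokens) > MAX_TOKENS:
        return False
    # Rule 3: token counts must be roughly comparable.
    longer = max(len(src_tokens), len(tgt_tokens))
    shorter = min(len(src_tokens), len(tgt_tokens))
    if longer / shorter > MAX_LENGTH_RATIO:
        return False
    # Rule 4: a target identical to the source is likely an untranslated copy.
    if src == tgt:
        return False
    return True


def clean_bitext(pairs: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Normalize whitespace, drop exact duplicates, and apply keep_pair."""
    seen = set()
    for src, tgt in pairs:
        # Collapse runs of whitespace so duplicates compare equal.
        src, tgt = " ".join(src.split()), " ".join(tgt.split())
        if (src, tgt) in seen:
            continue
        if keep_pair(src, tgt):
            seen.add((src, tgt))
            yield src, tgt


if __name__ == "__main__":
    sample = [
        ("Hello world", "Bonjour le monde"),
        ("Hello   world", "Bonjour le monde"),  # duplicate after normalization
        ("", "texte sans source"),              # empty source
        ("a", "b c d e f"),                     # length ratio too large
        ("same", "same"),                       # copied source
    ]
    for pair in clean_bitext(sample):
        print(pair)
```

Whitespace tokenization is a simplification here: it undercounts length for languages written without spaces, so a real multilingual filter would more likely measure characters or subword units.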