Kai Hu
2022
The VolcTrans System for WMT22 Multilingual Machine Translation Task
Xian Qian | Kai Hu | Jiaqiang Wang | Yifeng Liu | Xingyuan Pan | Jun Cao | Mingxuan Wang
Proceedings of the Seventh Conference on Machine Translation (WMT)
This report describes our VolcTrans system for the WMT22 shared task on large-scale multilingual machine translation. We participated in the unconstrained track, which allows the use of external resources. Our system is a transformer-based multilingual model trained on data from multiple sources, including the public training set from the data track, NLLB data provided by Meta AI, self-collected parallel corpora, and pseudo bitext from back-translation. Both bilingual and monolingual texts are cleaned by a series of heuristic rules. On the official test set, our system achieves $17.3$ BLEU, $21.9$ spBLEU, and $41.9$ chrF2++ on average over all language pairs. The average inference speed is $11.5$ sentences per second on a single Nvidia Tesla V100 GPU.
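The abstract does not spell out the cleaning heuristics. The sketch below illustrates the kind of rule-based bitext filtering such a pipeline commonly applies (length limits, length-ratio checks, copy and URL filtering, deduplication); all thresholds and helper names here are assumptions for illustration, not the authors' actual rules.

```python
# Illustrative bitext cleaning rules (hypothetical thresholds; the paper's
# actual heuristics are not specified in this abstract).
import re

MAX_LEN = 250          # assumed maximum sentence length in tokens
MAX_LEN_RATIO = 3.0    # assumed source/target length-ratio bound

def keep_pair(src: str, tgt: str) -> bool:
    """Return True if a (source, target) sentence pair passes the filters."""
    if not src or not tgt or src == tgt:          # empty or copied segments
        return False
    n_src, n_tgt = len(src.split()), len(tgt.split())
    if n_src > MAX_LEN or n_tgt > MAX_LEN:        # overly long sentences
        return False
    ratio = max(n_src, n_tgt) / max(1, min(n_src, n_tgt))
    if ratio > MAX_LEN_RATIO:                     # mismatched lengths
        return False
    if re.search(r"https?://", src + " " + tgt):  # drop pairs containing URLs
        return False
    return True

def clean_corpus(pairs):
    """Filter and deduplicate an iterable of (src, tgt) pairs."""
    seen = set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if keep_pair(src, tgt) and (src, tgt) not in seen:
            seen.add((src, tgt))
            yield src, tgt
```

Monolingual data used for back-translation would pass through analogous single-sided checks before pseudo bitext is generated.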