@inproceedings{park-etal-2021-papagos,
    title = "Papago{'}s Submissions to the {WMT}21 Triangular Translation Task",
    author = "Park, Jeonghyeok  and
      Kim, Hyunjoong  and
      Cho, Hyunchang",
    editor = "Barrault, Loic  and
      Bojar, Ondrej  and
      Bougares, Fethi  and
      Chatterjee, Rajen  and
      Costa-jussa, Marta R.  and
      Federmann, Christian  and
      Fishel, Mark  and
      Fraser, Alexander  and
      Freitag, Markus  and
      Graham, Yvette  and
      Grundkiewicz, Roman  and
      Guzman, Paco  and
      Haddow, Barry  and
      Huck, Matthias  and
      Yepes, Antonio Jimeno  and
      Koehn, Philipp  and
      Kocmi, Tom  and
      Martins, Andre  and
      Morishita, Makoto  and
      Monz, Christof",
    booktitle = "Proceedings of the Sixth Conference on Machine Translation",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2021.wmt-1.40/",
    pages = "341--346",
    abstract = "This paper describes Naver Papago{'}s submission to the WMT21 shared triangular MT task to enhance the non-English MT system with tri-language parallel data. The provided parallel data are Russian-Chinese (direct), Russian-English (indirect), and English-Chinese (indirect) data. This task aims to improve the quality of the Russian-to-Chinese MT system by exploiting the direct and indirect parallel resources. The direct parallel data is noisy data crawled from the web. To alleviate the issue, we conduct extensive experiments to find effective data filtering methods. With the empirical knowledge that the performance of bilingual MT is better than multi-lingual MT and related experiment results, we approach this task as bilingual MT, where the two indirect data are transformed to direct data. In addition, we use the Transformer, a robust translation model, as our baseline and integrate several techniques, averaging checkpoints, model ensemble, and re-ranking. Our final system provides a 12.7 BLEU points improvement over a baseline system on the WMT21 triangular MT development set. In the official evaluation of the test set, ours is ranked 2nd in terms of BLEU scores."
}