Abstract
This paper describes the LIT Team’s submission to the IWSLT 2020 open domain translation task, focusing primarily on the Japanese-to-Chinese translation direction. Our system is based on the organizers’ baseline system, and we concentrate on improving the Transformer baseline through elaborate data pre-processing. We obtain significant improvements, and this paper aims to share our data processing experience in this translation task. We also investigate large-scale back-translation on a monolingual corpus. In addition, we try shared and exclusive word embeddings and compare different token granularities, such as the sub-word level. Our Japanese-to-Chinese translation system achieves a BLEU score of 34.0 and ranks 2nd among all participating systems.
- Anthology ID:
- 2020.iwslt-1.12
- Volume:
- Proceedings of the 17th International Conference on Spoken Language Translation
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Marcello Federico, Alex Waibel, Kevin Knight, Satoshi Nakamura, Hermann Ney, Jan Niehues, Sebastian Stüker, Dekai Wu, Joseph Mariani, Francois Yvon
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 109–113
- URL:
- https://aclanthology.org/2020.iwslt-1.12
- DOI:
- 10.18653/v1/2020.iwslt-1.12
- Cite (ACL):
- Yimeng Zhuang, Yuan Zhang, and Lijie Wang. 2020. LIT Team’s System Description for Japanese-Chinese Machine Translation Task in IWSLT 2020. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 109–113, Online. Association for Computational Linguistics.
- Cite (Informal):
- LIT Team’s System Description for Japanese-Chinese Machine Translation Task in IWSLT 2020 (Zhuang et al., IWSLT 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.iwslt-1.12.pdf