An Empirical Study of Machine Translation for the Shared Task of WMT18
Chao Bei, Hao Zong, Yiming Wang, Baoyong Fan, Shiqi Li, Conghu Yuan
Abstract
This paper describes Global Tone Communication Co., Ltd.’s submission to the WMT18 shared news translation task. We participated in the English-to-Chinese direction and achieved the best BLEU score (43.8) among all participants. The submitted system focuses on data cleaning and on techniques for building a competitive model for this task. Unlike other participants' systems, ours relies mainly on data filtering to obtain the best BLEU score. We filter not only the provided sentences but also the back-translated sentences. The filtering techniques we apply include filtering by rules, by language models, and by translation models. We also conduct several experiments to validate the effectiveness of the training techniques. According to our experiments, the Annealing Adam optimizer and ensemble decoding are the most effective techniques for model training.
- Anthology ID:
- W18-6404
- Volume:
- Proceedings of the Third Conference on Machine Translation: Shared Task Papers
- Month:
- October
- Year:
- 2018
- Address:
- Belgium, Brussels
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 340–344
- URL:
- https://aclanthology.org/W18-6404
- DOI:
- 10.18653/v1/W18-6404
- Cite (ACL):
- Chao Bei, Hao Zong, Yiming Wang, Baoyong Fan, Shiqi Li, and Conghu Yuan. 2018. An Empirical Study of Machine Translation for the Shared Task of WMT18. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 340–344, Belgium, Brussels. Association for Computational Linguistics.
- Cite (Informal):
- An Empirical Study of Machine Translation for the Shared Task of WMT18 (Bei et al., WMT 2018)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/W18-6404.pdf