Abstract
We present our experiments for the WMT 2018 news translation task in the English→German direction. The core of our systems is an encoder-decoder neural machine translation model based on the Transformer architecture, which we enhanced with a deeper configuration. By using techniques that limit memory consumption, we were able to train models four times larger on a single GPU, improving performance by 1.2 BLEU points. Furthermore, we performed sentence selection on the newly available ParaCrawl corpus, which improved the effectiveness of the corpus by 0.5 BLEU points.
- Anthology ID:
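The abstract does not specify how sentences were selected from ParaCrawl. Purely as an illustrative sketch, a widely used criterion for this kind of corpus filtering is cross-entropy difference scoring (Moore–Lewis style): rank noisy sentences by how much more probable they are under a model trained on clean in-domain text than under a model trained on general text, and keep the top-ranked fraction. The function names, the character-trigram models, and the assumed vocabulary size below are all hypothetical simplifications, not the authors' method.

```python
import math
from collections import Counter

def train_char_ngrams(sentences, n=3):
    """Count character n-grams and their (n-1)-character contexts."""
    counts, ctx_counts = Counter(), Counter()
    for s in sentences:
        padded = " " * (n - 1) + s
        for i in range(len(s)):
            counts[padded[i:i + n]] += 1
            ctx_counts[padded[i:i + n - 1]] += 1
    return counts, ctx_counts

def char_ngram_logprob(text, counts, ctx_counts, n=3, vocab=64):
    """Per-character log-probability of `text` under an add-one-smoothed
    character n-gram model; `vocab` is an assumed alphabet size."""
    padded = " " * (n - 1) + text
    lp = 0.0
    for i in range(len(text)):
        gram = padded[i:i + n]
        lp += math.log((counts[gram] + 1) / (ctx_counts[gram[:-1]] + vocab))
    return lp / max(len(text), 1)

def select_sentences(noisy, in_domain, general, keep_ratio=0.5, n=3):
    """Keep the `keep_ratio` fraction of `noisy` sentences with the
    highest cross-entropy difference (in-domain score minus general)."""
    in_c, in_ctx = train_char_ngrams(in_domain, n)
    gen_c, gen_ctx = train_char_ngrams(general, n)
    scored = sorted(
        noisy,
        key=lambda s: char_ngram_logprob(s, in_c, in_ctx, n)
                      - char_ngram_logprob(s, gen_c, gen_ctx, n),
        reverse=True,
    )
    return scored[: max(1, int(len(scored) * keep_ratio))]
```

Sentences resembling the clean in-domain data score well under the in-domain model but near the smoothed floor under the general model, so the difference pushes them to the top of the ranking.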
- W18-6422
- Volume:
- Proceedings of the Third Conference on Machine Translation: Shared Task Papers
- Month:
- October
- Year:
- 2018
- Address:
- Belgium, Brussels
- Editors:
- Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 467–472
- URL:
- https://aclanthology.org/W18-6422
- DOI:
- 10.18653/v1/W18-6422
- Cite (ACL):
- Ngoc-Quan Pham, Jan Niehues, and Alexander Waibel. 2018. The Karlsruhe Institute of Technology Systems for the News Translation Task in WMT 2018. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 467–472, Belgium, Brussels. Association for Computational Linguistics.
- Cite (Informal):
- The Karlsruhe Institute of Technology Systems for the News Translation Task in WMT 2018 (Pham et al., WMT 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/W18-6422.pdf