Abstract
This paper describes our submission to the WMT 2019 Chinese-English (zh-en) news translation shared task. Our systems are based on RNN architectures with pre-trained embeddings that utilize character and sub-character information. We compare models at these different granularity levels using different evaluation metrics. We find that finer-granularity embeddings can help the model according to character-level evaluation, and that pre-trained embeddings can also be marginally beneficial for model performance when the training data is limited.
- Anthology ID:
- W19-5324
- Volume:
- Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 249–256
- URL:
- https://aclanthology.org/W19-5324
- DOI:
- 10.18653/v1/W19-5324
- Cite (ACL):
- Zhenhao Li and Lucia Specia. 2019. A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 249–256, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task (Li & Specia, WMT 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/W19-5324.pdf