A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task

Zhenhao Li, Lucia Specia


Abstract
This paper describes our submission to the WMT 2019 Chinese-English (zh-en) news translation shared task. Our systems are based on RNN architectures with pre-trained embeddings which utilize character and sub-character information. We compare models at these different granularity levels using different evaluation metrics. We find that finer-grained embeddings help the model according to character-level evaluation, and that pre-trained embeddings can also marginally benefit model performance when the training data is limited.
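To illustrate the kind of setup the abstract describes, the sketch below shows one way pre-trained character-level (or sub-character-level) embeddings could initialize the embedding layer of an RNN encoder in PyTorch. This is a minimal, hypothetical example, not the authors' actual system: the vocabulary, embedding matrix, and encoder names are placeholders, and the real embeddings would be trained on character or sub-character (e.g. radical) units of the Chinese source text.

```python
import torch
import torch.nn as nn

# Hypothetical toy vocabulary and pre-trained embedding matrix; in practice
# the vectors would be learned on character or sub-character units.
vocab = ["<pad>", "<unk>", "我", "们", "新", "闻"]
emb_dim = 300
pretrained = torch.randn(len(vocab), emb_dim)  # placeholder for real vectors

class RNNEncoder(nn.Module):
    """Bidirectional GRU encoder whose embedding layer is initialized
    from pre-trained (sub-)character embeddings."""
    def __init__(self, pretrained_weights, hidden_size=512, freeze=False):
        super().__init__()
        # Load the pre-trained vectors; freeze=False lets them be fine-tuned.
        self.embedding = nn.Embedding.from_pretrained(
            pretrained_weights, freeze=freeze, padding_idx=0)
        self.rnn = nn.GRU(pretrained_weights.size(1), hidden_size,
                          batch_first=True, bidirectional=True)

    def forward(self, src_ids):
        return self.rnn(self.embedding(src_ids))

encoder = RNNEncoder(pretrained)
ids = torch.tensor([[2, 3, 4, 5]])  # a toy character-level source sentence
outputs, hidden = encoder(ids)
print(outputs.shape)                # (1, 4, 1024) for a bidirectional GRU
```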
Anthology ID:
W19-5324
Volume:
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
249–256
URL:
https://aclanthology.org/W19-5324
DOI:
10.18653/v1/W19-5324
Cite (ACL):
Zhenhao Li and Lucia Specia. 2019. A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 249–256, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task (Li & Specia, WMT 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/W19-5324.pdf