A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task

Zhenhao Li, Lucia Specia


Abstract
This paper describes our submission to the WMT 2019 Chinese-English (zh-en) news translation shared task. Our systems are based on RNN architectures with pre-trained embeddings that use character and sub-character information. We compare models at these granularity levels using different evaluation metrics. We find that finer-granularity embeddings can help the model according to character-level evaluation, and that pre-trained embeddings can also be marginally beneficial for model performance when the training data is limited.
Anthology ID:
W19-5324
Volume:
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Month:
August
Year:
2019
Address:
Florence, Italy
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
249–256
URL:
https://aclanthology.org/W19-5324
DOI:
10.18653/v1/W19-5324
Cite (ACL):
Zhenhao Li and Lucia Specia. 2019. A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 249–256, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Comparison on Fine-grained Pre-trained Embeddings for the WMT19 Chinese-English News Translation Task (Li & Specia, WMT 2019)
PDF:
https://preview.aclanthology.org/auto-file-uploads/W19-5324.pdf