Fully Character-Level Neural Machine Translation without Explicit Segmentation

Jason Lee; Kyunghyun Cho; Thomas Hofmann

doi:10.1162/tacl_a_00067

Abstract

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT’15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of the BLEU score and human judgment.
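The length reduction described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the function names (`embed`, `conv1d_same`, `max_pool`), the embedding size, the fixed averaging filter (a stand-in for learned convolution filters), and the pooling stride of 5 are all assumptions chosen for illustration. It only shows how convolution followed by non-overlapping max-pooling turns a character sequence into a shorter sequence of segment-level vectors.

```python
import random

def embed(text, dim=4, seed=0):
    """Map each character to a fixed pseudo-random vector (hypothetical embedding table)."""
    rng = random.Random(seed)
    table = {}
    vectors = []
    for ch in text:
        if ch not in table:
            table[ch] = [rng.uniform(-1, 1) for _ in range(dim)]
        vectors.append(table[ch])
    return vectors

def conv1d_same(vectors, width=3):
    """Toy 1D convolution with a fixed averaging filter (an assumption; a real
    model learns its filters). Zero-padding keeps the output length equal to
    the input length for odd widths."""
    dim = len(vectors[0])
    pad = width // 2
    zero = [0.0] * dim
    padded = [zero] * pad + vectors + [zero] * pad
    out = []
    for i in range(len(vectors)):
        window = padded[i:i + width]
        out.append([sum(v[d] for v in window) / width for d in range(dim)])
    return out

def max_pool(vectors, stride=5):
    """Non-overlapping max-pooling over time: one output vector per `stride` positions."""
    dim = len(vectors[0])
    pooled = []
    for start in range(0, len(vectors), stride):
        segment = vectors[start:start + stride]
        pooled.append([max(v[d] for v in segment) for d in range(dim)])
    return pooled

source = "Ich bin ein Berliner"          # 20 characters
features = conv1d_same(embed(source))    # still 20 positions
segments = max_pool(features, stride=5)  # 20 / 5 = 4 positions
print(len(source), len(features), len(segments))
```

With these assumed settings, a recurrent encoder running over `segments` would process 4 positions instead of 20, which is the kind of saving the abstract credits for bringing training speed close to that of subword-level models.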

Anthology ID: Q17-1026
Volume: Transactions of the Association for Computational Linguistics, Volume 5
Year: 2017
Address: Cambridge, MA
Editors: Lillian Lee, Mark Johnson, Kristina Toutanova
Venue: TACL
Publisher: MIT Press
Pages: 365–378
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/Q17-1026/
DOI: 10.1162/tacl_a_00067
Cite (ACL): Jason Lee, Kyunghyun Cho, and Thomas Hofmann. 2017. Fully Character-Level Neural Machine Translation without Explicit Segmentation. Transactions of the Association for Computational Linguistics, 5:365–378.
Cite (Informal): Fully Character-Level Neural Machine Translation without Explicit Segmentation (Lee et al., TACL 2017)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/Q17-1026.pdf
Video: https://preview.aclanthology.org/jlcl-multiple-ingestion/Q17-1026.mp4
Code: nyu-dl/dl4mt-c2c + additional community code
