Chunk-Based Bi-Scale Decoder for Neural Machine Translation

Hao Zhou, Zhaopeng Tu, Shujian Huang, Xiaohua Liu, Hang Li, Jiajun Chen


Abstract
In typical neural machine translation (NMT), the decoder generates a sentence word by word, packing all linguistic granularities into the same RNN time-scale. In this paper, we propose a new type of decoder for NMT, which splits the decoding state into two parts and updates them on two different time-scales. Specifically, we first predict a chunk time-scale state for phrasal modeling, on top of which multiple word time-scale states are generated. In this way, the target sentence is translated hierarchically from chunks to words, leveraging information at different granularities. Experiments show that our proposed model significantly improves translation performance over a state-of-the-art NMT model.
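The two-time-scale decoding loop described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the real model uses learned GRU cells, attention, and word prediction, whereas here a plain tanh RNN cell with random parameters stands in for each scale, and all names (`rnn_step`, `bi_scale_decode`, `Wc`, `Ww`, etc.) are hypothetical. The point it shows is the clocking: the chunk-scale state updates once per chunk, and the word-scale state is re-initialized from it and updated once per word within that chunk.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 8  # hidden size (illustrative)

def rnn_step(W, U, x, h):
    """A single vanilla tanh-RNN transition; stands in for the GRU used in NMT."""
    return np.tanh(W @ x + U @ h)

# Toy random parameters for the two scales (a real model learns these).
Wc, Uc = rng.normal(size=(H, H)), rng.normal(size=(H, H))  # chunk time-scale
Ww, Uw = rng.normal(size=(H, H)), rng.normal(size=(H, H))  # word time-scale

def bi_scale_decode(chunks, h_chunk0):
    """Decode chunk by chunk: the slow (chunk) state ticks once per chunk,
    the fast (word) state starts from it and ticks once per word."""
    h_chunk = h_chunk0
    word_states = []
    for chunk in chunks:  # each chunk is a list of word embeddings
        # Chunk time-scale update (slow clock): once per chunk.
        h_chunk = rnn_step(Wc, Uc, chunk[0], h_chunk)
        # Word time-scale (fast clock) is seeded from the chunk state.
        h_word = h_chunk
        for x_word in chunk:
            h_word = rnn_step(Ww, Uw, x_word, h_word)
            word_states.append(h_word)
    return h_chunk, word_states

# Three chunks of 2, 3, and 1 words respectively.
chunks = [[rng.normal(size=H) for _ in range(n)] for n in (2, 3, 1)]
h_chunk, word_states = bi_scale_decode(chunks, np.zeros(H))
print(len(word_states))  # one word-scale state per target word -> 6
```

In a trained model the chunk state would be conditioned on the previous chunk's last word state and the source-side attention context, but the hierarchy of clocks is the same as above.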
Anthology ID:
P17-2092
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
580–586
URL:
https://aclanthology.org/P17-2092
DOI:
10.18653/v1/P17-2092
Cite (ACL):
Hao Zhou, Zhaopeng Tu, Shujian Huang, Xiaohua Liu, Hang Li, and Jiajun Chen. 2017. Chunk-Based Bi-Scale Decoder for Neural Machine Translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 580–586, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Chunk-Based Bi-Scale Decoder for Neural Machine Translation (Zhou et al., ACL 2017)
PDF:
https://preview.aclanthology.org/starsem-semeval-split/P17-2092.pdf
Code
 zhouh/chunk-nmt