Improving Sequence to Sequence Neural Machine Translation by Utilizing Syntactic Dependency Information

An Nguyen Le, Ander Martinez, Akifumi Yoshimoto, Yuji Matsumoto


Abstract
Sequence to Sequence Neural Machine Translation has achieved significant performance in recent years. Yet, there are some existing issues that Neural Machine Translation still does not solve completely. Two of them are the translation of long sentences and “over-translation”. To address these two problems, we propose an approach that utilizes more grammatical information, such as syntactic dependencies, so that the output can be generated based on more abundant information. In our approach, syntactic dependencies are employed in decoding. In addition, the output of the model is presented not as a simple sequence of tokens but as a linearized tree structure. In order to assess the performance, we construct a model based on an attention-based encoder-decoder model in which the source language is input to the encoder as a sequence and the decoder generates the target language as a linearized dependency tree structure. Experiments on the Europarl-v7 dataset of French-to-English translation demonstrate that our proposed method improves BLEU scores by 1.57 and 2.40 on datasets consisting of sentences with up to 50 and 80 tokens, respectively. Furthermore, the proposed method also solved the two existing problems in Neural Machine Translation: ineffective translation of long sentences and over-translation.
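The abstract does not spell out how the dependency tree is linearized. As a rough illustration only, the sketch below assumes a simple bracketed in-order traversal of a projective tree, with each word wrapped together with its dependents; the bracket symbols, the function names `linearize` and `strip_brackets`, and the toy sentence are all assumptions made here, not the format used in the paper.

```python
from collections import defaultdict


def linearize(tokens, heads):
    """Turn a projective dependency tree into a bracketed token sequence.

    tokens: surface words in sentence order.
    heads:  1-based index of each word's head, 0 for the root.
    NOTE: this bracketing scheme is an illustrative assumption.
    """
    children = defaultdict(list)
    root = None
    for index, head in enumerate(heads, start=1):
        if head == 0:
            root = index
        else:
            children[head].append(index)  # appended in left-to-right order
    if root is None:
        raise ValueError("the tree has no root (no head index equal to 0)")

    def visit(node):
        out = ["("]
        # Dependents to the left of the head come first.
        for child in children[node]:
            if child < node:
                out.extend(visit(child))
        out.append(tokens[node - 1])
        # Dependents to the right of the head follow it.
        for child in children[node]:
            if child > node:
                out.extend(visit(child))
        out.append(")")
        return out

    return visit(root)


def strip_brackets(sequence):
    """Recover the plain translation by dropping the structural symbols."""
    return [tok for tok in sequence if tok not in ("(", ")")]


# Toy example: "the" depends on "cat", "cat" depends on "sat" (the root).
tokens = ["the", "cat", "sat"]
heads = [2, 3, 0]
linearized = linearize(tokens, heads)
print(" ".join(linearized))
print(" ".join(strip_brackets(linearized)))
```

Under this assumed scheme a decoder would emit the brackets as ordinary vocabulary items, and the surface translation is obtained by discarding them afterwards.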
Anthology ID:
I17-1003
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
21–29
URL:
https://aclanthology.org/I17-1003
Cite (ACL):
An Nguyen Le, Ander Martinez, Akifumi Yoshimoto, and Yuji Matsumoto. 2017. Improving Sequence to Sequence Neural Machine Translation by Utilizing Syntactic Dependency Information. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 21–29, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Improving Sequence to Sequence Neural Machine Translation by Utilizing Syntactic Dependency Information (Nguyen Le et al., IJCNLP 2017)
PDF:
https://preview.aclanthology.org/update-css-js/I17-1003.pdf
Dataset:
 I17-1003.Datasets.zip