Look Harder: A Neural Machine Translation Model with Hard Attention

Sathish Reddy Indurthi, Insoo Chung, Sangha Kim


Abstract
Soft-attention based Neural Machine Translation (NMT) models have achieved promising results on several translation tasks. These models attend to all the words in the source sequence for each target token, which makes them ineffective for long sequence translation. In this work, we propose a hard-attention based NMT model which selects a subset of source tokens for each target token to effectively handle long sequence translation. Due to the discrete nature of the hard-attention mechanism, we design a reinforcement learning algorithm coupled with a reward shaping strategy to efficiently train it. Experimental results show that the proposed model performs better on long sequences and thereby achieves significant BLEU score improvement on English-German (EN-DE) and English-French (EN-FR) translation tasks compared to the soft-attention based NMT.
Anthology ID:
P19-1290
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3037–3043
URL:
https://aclanthology.org/P19-1290
DOI:
10.18653/v1/P19-1290
Cite (ACL):
Sathish Reddy Indurthi, Insoo Chung, and Sangha Kim. 2019. Look Harder: A Neural Machine Translation Model with Hard Attention. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3037–3043, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Look Harder: A Neural Machine Translation Model with Hard Attention (Indurthi et al., ACL 2019)
PDF:
https://preview.aclanthology.org/update-css-js/P19-1290.pdf