Abstract
Soft-attention based Neural Machine Translation (NMT) models have achieved promising results on several translation tasks. These models attend to all the words in the source sequence for each target token, which makes them ineffective for long-sequence translation. In this work, we propose a hard-attention based NMT model that selects a subset of source tokens for each target token to handle long-sequence translation effectively. Because the hard-attention mechanism is discrete, we design a reinforcement learning algorithm coupled with a reward shaping strategy to train it efficiently. Experimental results show that the proposed model performs better on long sequences and thereby achieves significant BLEU score improvements on English-German (EN-DE) and English-French (EN-FR) translation tasks compared to the soft-attention based NMT.
- Anthology ID:
- P19-1290
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Anna Korhonen, David Traum, Lluís Màrquez
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3037–3043
- URL:
- https://aclanthology.org/P19-1290
- DOI:
- 10.18653/v1/P19-1290
- Cite (ACL):
- Sathish Reddy Indurthi, Insoo Chung, and Sangha Kim. 2019. Look Harder: A Neural Machine Translation Model with Hard Attention. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3037–3043, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Look Harder: A Neural Machine Translation Model with Hard Attention (Indurthi et al., ACL 2019)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/P19-1290.pdf