Abstract
In neural machine translation, an attention model is used to identify the source words aligned to a given target word (the target foresight word) in order to select the translation context, but it makes no use of any information about this target foresight word. Previous work proposed improving the attention model by explicitly accessing the target foresight word and demonstrated substantial gains on the alignment task. However, that approach is inapplicable to the translation task, where the target foresight word is unavailable at decoding time. In this paper, we propose a new attention model, enhanced by implicit information about the target foresight word, that serves both the alignment and translation tasks. Empirical experiments on Chinese-to-English and Japanese-to-English datasets show that the proposed attention model delivers significant improvements in both alignment error rate and BLEU.
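To make the idea concrete, here is a minimal NumPy sketch of foresight-enhanced additive attention, under our own assumptions rather than the paper's exact formulation: alongside the usual decoder-state and encoder-state terms of Bahdanau-style attention, the score receives a projection of a hypothetical vector `f_pred` standing in for implicit information about the upcoming target word. All names, shapes, and the exact scoring form here are illustrative.

```python
import numpy as np

def foresight_attention(s_prev, enc_states, f_pred, W_s, W_h, W_f, v):
    """Additive attention with an extra foresight term (illustrative sketch).

    Scores each encoder state against the previous decoder state
    (standard additive attention) plus a projection of f_pred, a
    stand-in for predicted information about the upcoming target word.
    Passing a zero vector for f_pred recovers ordinary additive attention.
    """
    scores = np.tanh(enc_states @ W_h.T     # source-side term, shape (T, d_a)
                     + s_prev @ W_s.T       # decoder-state term, broadcast over T
                     + f_pred @ W_f.T) @ v  # hypothetical foresight term
    weights = np.exp(scores - scores.max())  # softmax over source positions
    return weights / weights.sum()

# Tiny usage example with random parameters (all sizes hypothetical).
T, d_h, d_s, d_e, d_a = 5, 8, 8, 6, 10
rng = np.random.default_rng(0)
enc = rng.normal(size=(T, d_h))  # encoder hidden states
s = rng.normal(size=d_s)         # previous decoder state
f = rng.normal(size=d_e)         # predicted embedding of the next target word
W_h, W_s, W_f = (rng.normal(size=(d_a, d)) for d in (d_h, d_s, d_e))
v = rng.normal(size=d_a)
print(foresight_attention(s, enc, f, W_s, W_h, W_f, v))  # weights sum to 1
```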
- Anthology ID: N18-1125
- Volume: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
- Month: June
- Year: 2018
- Address: New Orleans, Louisiana
- Editors: Marilyn Walker, Heng Ji, Amanda Stent
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 1380–1390
- URL: https://aclanthology.org/N18-1125
- DOI: 10.18653/v1/N18-1125
- Cite (ACL): Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, and Max Meng. 2018. Target Foresight Based Attention for Neural Machine Translation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1380–1390, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal): Target Foresight Based Attention for Neural Machine Translation (Li et al., NAACL 2018)
- PDF: https://preview.aclanthology.org/ml4al-ingestion/N18-1125.pdf