Dependency-Based Self-Attention for Transformer NMT
Abstract
In this paper, we propose a new Transformer neural machine translation (NMT) model that incorporates dependency relations into the self-attention mechanism on both the source and target sides, which we call dependency-based self-attention. Inspired by Linguistically-Informed Self-Attention (LISA), the dependency-based self-attention is trained to attend to the modifiee of each token under constraints derived from the dependency relations. While LISA was originally proposed for the Transformer encoder for semantic role labeling, this paper extends LISA to Transformer NMT by masking future information in the decoder-side dependency-based self-attention. In addition, our dependency-based self-attention operates on sub-word units created by byte pair encoding. Experiments show that our model improves translation quality by 1.0 BLEU point over the baseline model on the WAT’18 Asian Scientific Paper Excerpt Corpus Japanese-to-English translation task.
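The sketch below is not the authors' implementation; it is a minimal NumPy illustration of the idea described in the abstract, assuming a single LISA-style attention head that is supervised to place attention mass on each token's dependency head (its modifiee), with a causal mask so the decoder side cannot attend to future sub-word positions. All names (`dep_attention_head`, `gold_heads`, the handling of heads that lie in the future) are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' code) of decoder-side dependency-based
# self-attention: one head is supervised to attend to each token's
# dependency head ("modifiee"), and future positions are masked out.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dep_attention_head(Q, K, gold_heads, causal=True):
    """Scaled dot-product attention for one head.

    Q, K       : (T, d) query/key matrices for a length-T token sequence.
    gold_heads : length-T array; gold_heads[i] is the position of token i's
                 dependency head (its modifiee).
    causal     : if True, mask future positions (decoder side).

    Returns the attention matrix and an auxiliary cross-entropy loss that
    pushes attention weight onto the gold head position.
    """
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                 # (T, T) attention logits
    if causal:
        future = np.triu(np.ones((T, T), dtype=bool), k=1)
        scores = np.where(future, -1e9, scores)   # block attention to future tokens
    attn = softmax(scores, axis=-1)

    # Auxiliary dependency loss: -log p(gold head | token). As a simplification,
    # tokens whose head lies in the future are skipped under the causal mask.
    valid = gold_heads <= np.arange(T) if causal else np.ones(T, dtype=bool)
    probs = attn[np.arange(T), gold_heads]
    loss = -np.log(probs[valid] + 1e-9).mean()
    return attn, loss

# Toy usage: 5 sub-word tokens with a hypothetical dependency parse.
rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 8))
K = rng.normal(size=(5, 8))
gold_heads = np.array([1, 1, 1, 2, 0])
attn, dep_loss = dep_attention_head(Q, K, gold_heads, causal=True)
print(attn.shape, float(dep_loss))
```

In a full model this auxiliary loss would be added to the translation loss so that one head learns dependency-aware attention while the remaining heads are trained as usual.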
- Anthology ID:
- R19-1028
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 239–246
- URL:
- https://aclanthology.org/R19-1028
- DOI:
- 10.26615/978-954-452-056-4_028
- Cite (ACL):
- Hiroyuki Deguchi, Akihiro Tamura, and Takashi Ninomiya. 2019. Dependency-Based Self-Attention for Transformer NMT. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 239–246, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Dependency-Based Self-Attention for Transformer NMT (Deguchi et al., RANLP 2019)
- PDF:
- https://aclanthology.org/R19-1028.pdf