Abstract
In this work we analyze and compare the behavior of the Transformer architecture when using different positional encoding methods. While absolute and relative positional encoding perform equally well overall, we show that relative positional encoding is vastly superior (4.4% to 11.9% BLEU) when translating a sentence that is longer than any observed training sentence. We further propose and analyze variations of relative positional encoding and observe that the number of trainable parameters can be reduced without a performance loss, by using fixed encoding vectors or by removing some of the positional encoding vectors.
- Anthology ID:
- 2019.iwslt-1.20
- Volume:
- Proceedings of the 16th International Conference on Spoken Language Translation
- Month:
- November 2-3
- Year:
- 2019
- Address:
- Hong Kong
- Editors:
- Jan Niehues, Rolando Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- URL:
- https://preview.aclanthology.org/add_missing_videos/2019.iwslt-1.20/
- Cite (ACL):
- Jan Rosendahl, Viet Anh Khoa Tran, Weiyue Wang, and Hermann Ney. 2019. Analysis of Positional Encodings for Neural Machine Translation. In Proceedings of the 16th International Conference on Spoken Language Translation, Hong Kong. Association for Computational Linguistics.
- Cite (Informal):
- Analysis of Positional Encodings for Neural Machine Translation (Rosendahl et al., IWSLT 2019)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2019.iwslt-1.20.pdf