Abstract
Recurrent Neural Network Language Models composed of LSTM units, especially those augmented with an external memory, have achieved state-of-the-art results in language modeling. However, these models still struggle to process long sequences, which are more likely to contain long-distance dependencies, because of information fading. In this paper we demonstrate an effective mechanism for retrieving information in a memory-augmented LSTM LM, based on attending to information in memory in proportion to the number of timesteps the LSTM gating mechanism persisted that information.
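To make the idea concrete, the following is a minimal sketch of persistence-weighted attention over an external memory of past hidden states. It is an interpretation of the abstract only, not the paper's exact formulation: the assumption that persistence can be approximated from forget-gate activations, the `threshold` parameter, and the function name `persistence_weighted_attention` are all hypothetical choices for illustration.

```python
# Hypothetical sketch: attention over stored hidden states, boosted in
# proportion to how long the LSTM gating mechanism kept (persisted) the
# corresponding content. Assumptions are noted inline.
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def persistence_weighted_attention(memory, query, forget_gates, threshold=0.5):
    """
    memory:       (T, d) past hidden states stored in the external memory.
    query:        (d,)   current hidden state used as the attention query.
    forget_gates: (T, d) forget-gate activations recorded at each timestep.
    threshold:    gate value above which a cell counts as "persisted"
                  (an assumption made for this sketch).
    """
    # Per stored timestep, count how many cell dimensions the gate kept open,
    # then normalise to [0, 1] as a crude persistence score.
    persistence = (forget_gates > threshold).sum(axis=1) / forget_gates.shape[1]  # (T,)

    # Standard dot-product attention scores, scaled up for persistent slots.
    scores = memory @ query                              # (T,)
    weights = softmax(scores * (1.0 + persistence))      # (T,)

    # Read from memory as a persistence-aware convex combination.
    return weights @ memory                              # (d,)

# Toy usage: 6 remembered states of dimension 4.
rng = np.random.default_rng(0)
mem = rng.standard_normal((6, 4))
q = rng.standard_normal(4)
gates = rng.uniform(size=(6, 4))
print(persistence_weighted_attention(mem, q, gates).shape)  # (4,)
```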
- Anthology ID: R19-1121
- Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month: September
- Year: 2019
- Address: Varna, Bulgaria
- Editors: Ruslan Mitkov, Galia Angelova
- Venue: RANLP
- Publisher: INCOMA Ltd.
- Pages: 1052–1059
- URL: https://aclanthology.org/R19-1121
- DOI: 10.26615/978-954-452-056-4_121
- Cite (ACL): Giancarlo Salton and John Kelleher. 2019. Persistence pays off: Paying Attention to What the LSTM Gating Mechanism Persists. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 1052–1059, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal): Persistence pays off: Paying Attention to What the LSTM Gating Mechanism Persists (Salton & Kelleher, RANLP 2019)
- PDF: https://preview.aclanthology.org/nschneid-patch-3/R19-1121.pdf