LSTMs Exploit Linguistic Attributes of Data

Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith


Abstract
While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM’s ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.
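The memorization task described in the abstract can be illustrated with a small data-generation sketch. The exact setup is not specified on this page, so the function names, the fixed recall index, and the Zipfian contrast below are illustrative assumptions: the model would read a token sequence and be trained to output the token at a given position, with training sequences drawn either uniformly (non-language) or from a skewed, language-like distribution.

```python
import random

def make_recall_example(vocab_size, seq_len, recall_index, rng, zipfian=False):
    """Build one instance of a simple recall task (illustrative, not the
    paper's exact setup): the model reads `seq` and must later output
    the token that appeared at position `recall_index`.

    If `zipfian` is True, tokens are sampled from a Zipf-like skewed
    distribution, loosely mimicking natural-language word frequencies;
    otherwise they are sampled uniformly (non-language sequential data).
    """
    if zipfian:
        # Weight of rank-r token proportional to 1/(r+1).
        weights = [1.0 / (r + 1) for r in range(vocab_size)]
        seq = rng.choices(range(vocab_size), weights=weights, k=seq_len)
    else:
        seq = [rng.randrange(vocab_size) for _ in range(seq_len)]
    target = seq[recall_index]
    return seq, target

rng = random.Random(0)
uniform_seq, uniform_tgt = make_recall_example(50, 10, 3, rng, zipfian=False)
zipf_seq, zipf_tgt = make_recall_example(50, 10, 3, rng, zipfian=True)
```

Under the paper's hypothesis, an LSTM trained on the skewed (language-like) variant recalls tokens from much longer sequences than one trained on the uniform variant, and it solves the task partly by dedicating a subset of neurons to counting timesteps.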
Anthology ID:
W18-3024
Volume:
Proceedings of the Third Workshop on Representation Learning for NLP
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
180–186
URL:
https://aclanthology.org/W18-3024
DOI:
10.18653/v1/W18-3024
Cite (ACL):
Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, and Noah A. Smith. 2018. LSTMs Exploit Linguistic Attributes of Data. In Proceedings of the Third Workshop on Representation Learning for NLP, pages 180–186, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
LSTMs Exploit Linguistic Attributes of Data (Liu et al., RepL4NLP 2018)
PDF:
https://preview.aclanthology.org/remove-xml-comments/W18-3024.pdf
Data
Penn Treebank