Priorless Recurrent Networks Learn Curiously

Jeff Mitchell, Jeffrey Bowers


Abstract
Recently, domain-general recurrent neural networks, without explicit linguistic inductive biases, have been shown to successfully reproduce a range of human language behaviours, such as accurately predicting number agreement between nouns and verbs. We show that such networks will also learn number agreement within unnatural sentence structures, i.e. structures that are not found in any natural language and which humans struggle to process. These results suggest that the models are learning from their input in a manner that is substantially different from human language acquisition, and we undertake an analysis of how the learned knowledge is stored in the weights of the network. We find that while the model has an effective understanding of singular versus plural for individual sentences, it lacks a unified concept of number agreement connecting these processes across the full range of inputs. Moreover, the weights handling natural and unnatural structures overlap substantially, in a way that underlines the non-human-like nature of the knowledge learned by the network.
Anthology ID:
2020.coling-main.451
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5147–5158
URL:
https://aclanthology.org/2020.coling-main.451
DOI:
10.18653/v1/2020.coling-main.451
Cite (ACL):
Jeff Mitchell and Jeffrey Bowers. 2020. Priorless Recurrent Networks Learn Curiously. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5147–5158, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Priorless Recurrent Networks Learn Curiously (Mitchell & Bowers, COLING 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.coling-main.451.pdf
Data
Universal Dependencies