Abstract
We train neural networks to optimize a Minimum Description Length score, that is, to balance the complexity of the network against its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as aⁿbⁿ, aⁿbⁿcⁿ, aⁿb²ⁿ, and aⁿbᵐcⁿ⁺ᵐ, and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.
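As a sketch of the objective the abstract describes: in the standard two-part MDL formulation, the score trades off the description length of the hypothesis against the description length of the data under that hypothesis. The paper's concrete encoding of networks and data is its own and is not reproduced here; this is only the generic form such a score takes.

```latex
% Generic two-part MDL score; a sketch, not the paper's exact encoding.
% |h|     = description length (in bits) of the hypothesis h,
%           here an encoding of the network's architecture and weights.
% |D : h| = description length of the data D when encoded with h's help,
%           e.g., the cumulative surprisal -\sum_t \log_2 p_h(x_t \mid x_{<t}).
\[
  \mathrm{MDL}(h, D) \;=\; |h| \;+\; |D : h|
\]
```

Minimizing this sum penalizes both overly large networks (the first term) and inaccurate ones (the second term), which is the balance between complexity and accuracy described above.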
- Anthology ID: 2022.tacl-1.45
- Volume: Transactions of the Association for Computational Linguistics, Volume 10
- Year: 2022
- Address: Cambridge, MA
- Editors: Brian Roark, Ani Nenkova
- Venue: TACL
- Publisher: MIT Press
- Pages: 785–799
- URL: https://aclanthology.org/2022.tacl-1.45
- DOI: 10.1162/tacl_a_00489
- Cite (ACL): Nur Lan, Michal Geyer, Emmanuel Chemla, and Roni Katzir. 2022. Minimum Description Length Recurrent Neural Networks. Transactions of the Association for Computational Linguistics, 10:785–799.
- Cite (Informal): Minimum Description Length Recurrent Neural Networks (Lan et al., TACL 2022)
- PDF: https://aclanthology.org/2022.tacl-1.45.pdf