Development of Cognitive Intelligence in Pre-trained Language Models

Raj Sanjay Shah, Khushi Bhardwaj, Sashank Varma


Abstract
Recent studies show evidence for emergent cognitive abilities in Large Pre-trained Language Models (PLMs). The increasing cognitive alignment of these models has made them candidates for cognitive science theories. Prior research into the emergent cognitive abilities of PLMs has been path-independent of model training, i.e., it has examined only the final model weights and not the intermediate training steps. However, building plausible models of human cognition using PLMs also requires aligning their performance during training to the developmental trajectories of children’s thinking. Guided by psychometric tests of human intelligence, we choose four task categories: numerical ability, linguistic abilities, conceptual understanding, and fluid reasoning. We investigate the alignment of ten popular families of PLMs on these categories, evaluating each of their available intermediate and final training checkpoints. We find a striking regularity: regardless of model size, the developmental trajectories of PLMs consistently exhibit a window of maximal alignment to human cognitive development. Before that window, training appears to endow models with the structure needed to rapidly learn from experience. After that window, training appears to serve the engineering goal of reducing loss but not the scientific goal of increasing alignment with human cognition.
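
The checkpoint-wise evaluation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code (see the Software link below): it assumes Pythia-style intermediate checkpoints tagged by revision on the Hugging Face Hub, and the task items and human accuracies are hypothetical stand-ins for the paper's psychometric item banks.

# Minimal sketch: score intermediate pre-training checkpoints on
# psychometric-style items and correlate the model's item profile with a
# (hypothetical) human accuracy profile. Assumes transformers, torch, scipy.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from scipy.stats import spearmanr

MODEL = "EleutherAI/pythia-160m"           # one PLM family with public checkpoints
STEPS = [1000, 10000, 100000, 143000]      # available intermediate training steps

# Hypothetical two-alternative items, one per task category:
# (prompt, correct continuation, foil).
ITEMS = [
    ("Two plus three equals", " five", " four"),          # numerical ability
    ("The keys to the cabinet", " are", " is"),           # linguistic abilities
    ("A robin is a kind of", " bird", " tool"),           # conceptual understanding
    ("Hot is to cold as up is to", " down", " left"),     # fluid reasoning
]
HUMAN_ACC = [0.95, 0.90, 0.98, 0.85]       # hypothetical human accuracies per item

def continuation_logprob(model, tok, prompt, cont):
    """Sum of token log-probabilities of `cont` given `prompt` under a causal LM."""
    enc = tok(prompt + cont, return_tensors="pt")
    n_prompt = len(tok(prompt)["input_ids"])
    with torch.no_grad():
        logits = model(**enc).logits.log_softmax(-1)
    ids = enc["input_ids"][0]
    # Token at position i is predicted from logits at position i - 1.
    return sum(logits[0, i - 1, ids[i]].item() for i in range(n_prompt, len(ids)))

for step in STEPS:
    tok = AutoTokenizer.from_pretrained(MODEL, revision=f"step{step}")
    model = AutoModelForCausalLM.from_pretrained(MODEL, revision=f"step{step}")
    model.eval()
    # Item is "passed" if the correct continuation outscores the foil.
    model_acc = [
        float(continuation_logprob(model, tok, p, good)
              > continuation_logprob(model, tok, p, bad))
        for p, good, bad in ITEMS
    ]
    # With a real item bank this correlation runs over many items per category.
    rho, _ = spearmanr(model_acc, HUMAN_ACC)
    print(f"step {step}: item accuracies {model_acc}, alignment rho={rho}")

Tracking an alignment measure like this across training steps is what reveals the window of maximal alignment the abstract describes: the correlation rises, peaks, and then plateaus or declines even as training loss keeps falling.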
Anthology ID:
2024.emnlp-main.539
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
9632–9657
URL:
https://preview.aclanthology.org/fix-sig-urls/2024.emnlp-main.539/
DOI:
10.18653/v1/2024.emnlp-main.539
Cite (ACL):
Raj Sanjay Shah, Khushi Bhardwaj, and Sashank Varma. 2024. Development of Cognitive Intelligence in Pre-trained Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 9632–9657, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Development of Cognitive Intelligence in Pre-trained Language Models (Shah et al., EMNLP 2024)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2024.emnlp-main.539.pdf
Software:
2024.emnlp-main.539.software.zip
Data:
2024.emnlp-main.539.data.zip