How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection

Xiaomeng Ma, Lingyu Gao


Abstract
There is an ongoing debate about whether neural networks can grasp quasi-regularities in language as humans do. On a typical quasi-regularity task, English past tense inflection, neural network models have long been criticized for learning only to generalize the most frequent pattern rather than the regular pattern, and thus for failing to learn the abstract categories of regular and irregular verbs, making them dissimilar to humans. In this work, we train a set of transformer models with different settings to examine their behavior on this task. The models achieve high accuracy on unseen regular verbs and some accuracy on unseen irregular verbs. Their performance on regulars is heavily affected by type frequency and type ratio but not by token frequency and token ratio, and vice versa for irregulars. These distinct behaviors on regulars and irregulars suggest that the models acquire some degree of symbolic knowledge about verb regularity. However, the models correlate only weakly with human behavior on nonce verbs. Although the transformer exhibits some learning of the abstract category of verb regularity, its performance does not fit the human data well, suggesting that it may not be a good cognitive model.
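The setup the abstract describes, training transformers to inflect verbs, is usually framed as character-level sequence transduction (lemma in, past-tense form out). Below is a minimal sketch of that framing in PyTorch; it is not the authors' released code, and the toy verb pairs, vocabulary handling, and all hyperparameters are illustrative assumptions.

```python
# Minimal character-level transformer for past-tense inflection.
# A sketch of the general task setup, not the paper's actual model.
import torch
import torch.nn as nn

# Toy lemma -> past-tense pairs; the paper trains on far larger verb lists.
PAIRS = [("walk", "walked"), ("jump", "jumped"), ("sing", "sang")]

chars = sorted({c for pair in PAIRS for word in pair for c in word})
PAD, BOS, EOS = 0, 1, 2                 # special symbols
stoi = {c: i + 3 for i, c in enumerate(chars)}
V = len(stoi) + 3                       # vocabulary size

def encode(word, max_len=12):
    """Map a word to BOS + char ids + EOS, padded to max_len."""
    ids = [BOS] + [stoi[c] for c in word] + [EOS]
    return ids + [PAD] * (max_len - len(ids))

class Inflector(nn.Module):
    def __init__(self, d_model=64, max_len=12):
        super().__init__()
        self.emb = nn.Embedding(V, d_model, padding_idx=PAD)
        self.pos = nn.Embedding(max_len, d_model)   # learned positions
        self.tf = nn.Transformer(d_model=d_model, nhead=4,
                                 num_encoder_layers=2, num_decoder_layers=2,
                                 dim_feedforward=128, batch_first=True)
        self.out = nn.Linear(d_model, V)

    def embed(self, x):
        pos = torch.arange(x.size(1), device=x.device)
        return self.emb(x) + self.pos(pos)

    def forward(self, src, tgt):
        # Causal mask keeps the decoder from peeking at future characters.
        mask = self.tf.generate_square_subsequent_mask(tgt.size(1))
        h = self.tf(self.embed(src), self.embed(tgt), tgt_mask=mask)
        return self.out(h)

model = Inflector()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

src = torch.tensor([encode(lemma) for lemma, _ in PAIRS])
tgt = torch.tensor([encode(past) for _, past in PAIRS])

for step in range(100):                 # teacher-forced training
    logits = model(src, tgt[:, :-1])    # predict each next character
    loss = loss_fn(logits.reshape(-1, V), tgt[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```

Evaluation in this framing means decoding held-out lemmas (including nonce verbs such as "wug") greedily from BOS and checking whether the model produces the regular "-ed" form or an irregular vowel-change pattern.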
Anthology ID:
2022.aacl-main.81
Volume:
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
November
Year:
2022
Address:
Online only
Editors:
Yulan He, Heng Ji, Sujian Li, Yang Liu, Chia-Hui Chang
Venues:
AACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
1101–1114
URL:
https://aclanthology.org/2022.aacl-main.81
Cite (ACL):
Xiaomeng Ma and Lingyu Gao. 2022. How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1101–1114, Online only. Association for Computational Linguistics.
Cite (Informal):
How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection (Ma & Gao, AACL-IJCNLP 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.aacl-main.81.pdf