Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers

Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney


Abstract
Over the past decade, various studies have addressed how speakers solve the so-called ‘Paradigm Cell Filling Problem’ (PCFP) (CITATION) across different languages. The PCFP addresses a fundamental question in morphological processing: how do speakers accurately generate inflected forms of words when presented with incomplete paradigms? This problem is particularly salient when modeling complex inflectional systems. We focus on Spanish verbal paradigms, in which certain verbs follow an irregular L-shaped pattern: the first-person singular present indicative stem matches the stem used throughout the present subjunctive mood. We formulate the problem as a morphological reinflection task. Specifically, we investigate the role of input frequency in the acquisition of regular versus irregular L-shaped patterns in transformer models. By systematically manipulating the input distributions and analyzing model behavior, we reveal four key findings: 1) Models perform better on L-shaped verbs than on regular verbs, especially in uneven frequency conditions; 2) Robust primacy effects are observed, but no consistent recency effects; 3) Memorization becomes more prominent as the proportion of L-shaped verbs increases; 4) Models tend to regularize L-shaped verbs when their consonant alternation pairs are rare or absent in the training data.
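To make the L-shaped pattern concrete, the following is a minimal sketch that checks whether a present-tense paradigm shares its first-person singular indicative stem with every present subjunctive cell while at least one other indicative cell uses a different stem. The example verbs (decir, comer), the dictionary layout, the function name `is_l_shaped`, and the prefix-based stem comparison are illustrative assumptions and are not taken from the paper.

```python
# Present-tense cells, ordered 1sg, 2sg, 3sg, 1pl, 2pl, 3pl.
# The verbs below are illustrative choices, not data from the paper.
DECIR = {
    "indicative": ["digo", "dices", "dice", "decimos", "decís", "dicen"],
    "subjunctive": ["diga", "digas", "diga", "digamos", "digáis", "digan"],
}
COMER = {
    "indicative": ["como", "comes", "come", "comemos", "coméis", "comen"],
    "subjunctive": ["coma", "comas", "coma", "comamos", "comáis", "coman"],
}


def is_l_shaped(paradigm):
    """Hypothetical check for the L-shaped pattern.

    Returns True when the 1sg indicative stem appears in every subjunctive
    cell but is missing from at least one other indicative cell.
    """
    first_sg = paradigm["indicative"][0]
    # Simplifying assumption: the 1sg indicative ends in -o, so dropping
    # the last character yields the stem.
    stem = first_sg[:-1]
    in_all_subjunctive = all(
        form.startswith(stem) for form in paradigm["subjunctive"]
    )
    differs_in_indicative = any(
        not form.startswith(stem) for form in paradigm["indicative"][1:]
    )
    return in_all_subjunctive and differs_in_indicative


print(is_l_shaped(DECIR))  # True: "dig-" in 1sg indicative and all subjunctive cells
print(is_l_shaped(COMER))  # False: "com-" is used in every cell, so no alternation
```

The prefix comparison is deliberately crude: it would not handle verbs whose first-person singular indicative does not end in -o, and it says nothing about how the models in the paper represent stems.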
Anthology ID:
2025.findings-acl.230
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4474–4489
URL:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.230/
Cite (ACL):
Akhilesh Kakolu Ramarao, Kevin Tang, and Dinah Baer-Henney. 2025. Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers. In Findings of the Association for Computational Linguistics: ACL 2025, pages 4474–4489, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers (Ramarao et al., Findings 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.230.pdf