On the Complexity and Typology of Inflectional Morphological Systems

Ryan Cotterell, Christo Kirov, Mans Hulden, Jason Eisner


Abstract
We quantify the linguistic complexity of different languages’ morphological systems. We verify that there is a statistically significant empirical trade-off between paradigm size and irregularity: A language’s inflectional paradigms may be either large in size or highly irregular, but never both. We define a new measure of paradigm irregularity based on the conditional entropy of the surface realization of a paradigm, that is, how hard it is to jointly predict all the word forms in a paradigm from the lemma. We estimate irregularity by training a predictive model. Our measurements are taken on large morphological paradigms from 36 typologically diverse languages.
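As a rough illustration of estimating irregularity with a predictive model, the sketch below assumes a trained model that assigns a joint probability to all the forms of a paradigm given its lemma. The average negative log-probability over held-out paradigms is then a cross-entropy, which in expectation upper-bounds the true conditional entropy. The function name and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy_bits(log_probs):
    """Average negative log2-probability per held-out paradigm.

    `log_probs` holds natural-log values of p(all forms | lemma), one per
    held-out paradigm, as scored by some already-trained model (assumed).
    """
    if not log_probs:
        raise ValueError("need at least one held-out paradigm")
    # Divide by ln(2) to convert nats to bits.
    return -sum(log_probs) / (len(log_probs) * math.log(2))


# Hypothetical scores for three held-out paradigms (illustrative values only).
held_out = [math.log(0.5), math.log(0.25), math.log(0.125)]
print(cross_entropy_bits(held_out))  # roughly 2 bits per paradigm
```

A lower value means the model finds the paradigms easier to predict from the lemma, which under this measure corresponds to a more regular system.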
Anthology ID:
Q19-1021
Volume:
Transactions of the Association for Computational Linguistics, Volume 7
Year:
2019
Address:
Cambridge, MA
Editors:
Lillian Lee, Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
327–342
URL:
https://aclanthology.org/Q19-1021
DOI:
10.1162/tacl_a_00271
Cite (ACL):
Ryan Cotterell, Christo Kirov, Mans Hulden, and Jason Eisner. 2019. On the Complexity and Typology of Inflectional Morphological Systems. Transactions of the Association for Computational Linguistics, 7:327–342.
Cite (Informal):
On the Complexity and Typology of Inflectional Morphological Systems (Cotterell et al., TACL 2019)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/Q19-1021.pdf