An Empirical Study of Language Syllabification using Syllabary and Lexical Networks

Rusali Saha, Yannick Marchand


Abstract
Language syllabification is the separation of a word into written or spoken syllables. The study of syllabification plays a pivotal role in morphology, and there have been previous attempts to study this phenomenon using graphs or networks. Previous approaches have claimed, through visual estimation, that the degree distribution of language networks follows a power-law distribution; however, no empirically grounded metrics have been used to verify this claim. In our study, we implement two kinds of language networks, namely syllabary and lexical networks, and use network analysis to investigate the syllabification of four European languages (English, French, German and Spanish), examining the small-world, random and scale-free nature of these networks. We additionally show empirically that, contrary to claims in previous works, although the degree distributions of these networks appear to follow a power-law distribution, they are actually more in agreement with a log-normal distribution when numerically grounded curve fitting is applied. Finally, we explore how syllabary and lexical networks for the English language change over time using a database of words with age-of-acquisition ratings. Our analysis further shows that the preferential attachment mechanism appears to be a well-grounded explanation for the degree distribution of the syllabary network.
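The contrast the abstract draws between visual estimation and numerically grounded fitting can be illustrated with a small likelihood comparison. The sketch below is not the authors' procedure, which the abstract does not specify. It assumes continuous approximations to both distributions, a fixed lower cutoff `xmin`, and a plain comparison of maximised log-likelihoods that ignores the differing parameter counts. The function names `power_law_fit`, `log_normal_fit` and `preferred_model` are hypothetical, and the degree sequence is synthetic rather than taken from any syllabary or lexical network.

```python
import math
import random


def power_law_fit(xs, xmin):
    """Maximum-likelihood fit of a continuous power law, p(x) proportional
    to x**(-alpha), restricted to values x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    log_ratio_sum = sum(math.log(x / xmin) for x in tail)
    alpha = 1.0 + n / log_ratio_sum
    loglik = n * math.log((alpha - 1.0) / xmin) - alpha * log_ratio_sum
    return alpha, loglik


def log_normal_fit(xs):
    """Maximum-likelihood fit of an (untruncated) log-normal distribution."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    loglik = sum(
        -v - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
        - (v - mu) ** 2 / (2.0 * sigma ** 2)
        for v in logs
    )
    return mu, sigma, loglik


def preferred_model(xs, xmin=1.0):
    """Return the name of the model with the higher log-likelihood,
    evaluated on the same data (values >= xmin) for both models."""
    data = [x for x in xs if x >= xmin]
    _, ll_power = power_law_fit(data, xmin)
    _, _, ll_lognormal = log_normal_fit(data)
    return "log-normal" if ll_lognormal > ll_power else "power law"


# Synthetic, illustrative degree sequence (an assumption, not real data):
# log-normally distributed values rounded to integer degrees >= 1.
rng = random.Random(0)
degrees = [max(1, round(rng.lognormvariate(2.0, 0.8))) for _ in range(2000)]
print(preferred_model(degrees))
```

On a log-log plot, the tail of such a sample can look close to a straight line, which is why a likelihood-based comparison is more reliable than visual inspection. A more careful analysis would also estimate `xmin` from the data and use a formal likelihood-ratio test rather than a raw comparison.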
Anthology ID:
2025.cmcl-1.24
Volume:
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico, USA
Editors:
Tatsuki Kuribayashi, Giulia Rambelli, Ece Takmaz, Philipp Wicke, Jixing Li, Byung-Doh Oh
Venues:
CMCL | WS
Publisher:
Association for Computational Linguistics
Pages:
197–206
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.cmcl-1.24/
Cite (ACL):
Rusali Saha and Yannick Marchand. 2025. An Empirical Study of Language Syllabification using Syllabary and Lexical Networks. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 197–206, Albuquerque, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
An Empirical Study of Language Syllabification using Syllabary and Lexical Networks (Saha & Marchand, CMCL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.cmcl-1.24.pdf