Michaela Koščová
2025
Syntactic units and their length distributions: A case study in Czech
Michaela Nogolová
|
Michaela Koščová
|
Ján Mačutek
|
Radek Čech
Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025)
This study investigates the length distributions of syntactic units in Czech across multiple hierarchical levels: sentences, independent clauses, clauses, phrases, subphrases, and chunks. Using a diverse dataset (Universal Dependencies treebanks, presidential speeches, the Czech Bible, and a random sample from corpora of modern Czech), the analysis examines whether the lengths of these syntactic units follow consistent distributional patterns. Length is defined as the number of immediate subunits, and the distributions are modeled using the hyper-Poisson distribution. The results demonstrate that the hyper-Poisson model fits the length distributions of all the above-mentioned syntactic units well, pointing to a common principle underlying the organization of syntactic structure in Czech.
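For readers unfamiliar with the model named in the abstract, the following is a minimal sketch of the standard hyper-Poisson probability mass function, P(x) = a^x / (b^(x) · 1F1(1; b; a)) for x = 0, 1, 2, ..., where b^(x) is the rising factorial. The function names, the parameter names `a` and `b`, and the series truncation are illustrative assumptions; the abstract does not state which parametrization, displacement, or fitting procedure the authors used.

```python
import math


def _normalizer(a: float, b: float, tol: float = 1e-15, max_terms: int = 100_000) -> float:
    """Approximate 1F1(1; b; a) = sum_j a^j / b^(j) by summing the series.

    Assumes a > 0 and b > 0. The summation stops once the terms are
    decreasing and negligible relative to the running total.
    """
    term, total = 1.0, 0.0
    for j in range(max_terms):
        total += term
        term *= a / (b + j)  # next term: a^(j+1) / b^(j+1)
        if term < tol * total and (b + j) > a:
            break
    return total


def hyper_poisson_pmf(x: int, a: float, b: float) -> float:
    """Probability of the value x under a hyper-Poisson distribution (support 0, 1, 2, ...)."""
    if x < 0:
        return 0.0
    term = 1.0
    for j in range(x):
        term *= a / (b + j)  # builds a^x / (b (b+1) ... (b+x-1))
    return term / _normalizer(a, b)


def is_poisson_special_case(a: float = 2.0, x: int = 3) -> bool:
    """Sanity check: with b = 1 the model reduces to the ordinary Poisson distribution."""
    poisson = a**x / math.factorial(x) * math.exp(-a)
    return math.isclose(hyper_poisson_pmf(x, a, 1.0), poisson, rel_tol=1e-12)
```

Because a unit contains at least one subunit, an application to length data would likely evaluate the function at length minus one (a displacement by 1); that shift is an assumption here, not something the abstract confirms.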
2015
On the relation between verb full valency and synonymy
Radek Čech
|
Ján Mačutek
|
Michaela Koščová
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)