Compression-based Language Complexity under Register Variation in Portuguese

Felipe Ribas Serras, Marcelo Finger


Abstract
Compression-based language complexity metrics show promise as holistic measures of linguistic complexity in both intra- and cross-linguistic settings. Yet their sensitivity to specific forms of linguistic variation requires further experimental validation. We examine the sensitivity of this metric family to register variation in Portuguese; such sensitivity has already been established for English. We refine the validation process found in previous literature by introducing a more granular statistical analysis that evaluates both the individual and the joint sensitivity of these metrics to register variation at the sentence level. Our results confirm that these metrics are highly sensitive to functional variation in Portuguese and exhibit a structural morphosyntactic trade-off consistent with that observed in English and in cross-linguistic studies.
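To make the idea of a compression-based metric concrete, the following is a minimal Python sketch, not the authors' procedure. It assumes zlib as the compressor and uses random character deletion and random word deletion as rough stand-ins for morphological and syntactic distortion; the function names, the 10% deletion rate, and the sample sentence are all hypothetical choices made for illustration.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_ratio(text: str) -> float:
    """Compressed size over raw size; lower values mean more redundancy."""
    raw = len(text.encode("utf-8"))
    return compressed_size(text) / raw if raw else 0.0


def delete_characters(text: str, rate: float, rng: random.Random) -> str:
    """Drop each non-space character with probability `rate`.

    Assumption: a crude proxy for disturbing word-internal (morphological)
    regularities.
    """
    return "".join(ch for ch in text if ch.isspace() or rng.random() >= rate)


def delete_words(text: str, rate: float, rng: random.Random) -> str:
    """Drop each whitespace-separated token with probability `rate`.

    Assumption: a crude proxy for disturbing word-order (syntactic)
    regularities.
    """
    return " ".join(w for w in text.split() if rng.random() >= rate)


def distortion_ratio(text: str, distort, rate: float = 0.1, seed: int = 0) -> float:
    """Compressed size of the distorted text relative to the original."""
    rng = random.Random(seed)
    return compressed_size(distort(text, rate, rng)) / compressed_size(text)


if __name__ == "__main__":
    # Hypothetical sample sentence, repeated to give the compressor context.
    sample = "O menino leu o livro que a professora recomendou ontem. " * 20
    print("overall:", compression_ratio(sample))
    print("character deletion:", distortion_ratio(sample, delete_characters))
    print("word deletion:", distortion_ratio(sample, delete_words))
```

Under these assumptions, the two distortion ratios could be compared across texts from different registers to look for a trade-off between them; the statistical analysis described in the abstract is not reproduced here.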
Anthology ID:
2026.propor-1.80
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
808–818
URL:
https://preview.aclanthology.org/ingest-dnd/2026.propor-1.80/
Cite (ACL):
Felipe Ribas Serras and Marcelo Finger. 2026. Compression-based Language Complexity under Register Variation in Portuguese. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 808–818, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
Compression-based Language Complexity under Register Variation in Portuguese (Serras & Finger, PROPOR 2026)
PDF:
https://preview.aclanthology.org/ingest-dnd/2026.propor-1.80.pdf