Evaluating the Interplay of Information Status and Information Content in a Multilingual Parallel Corpus
Julius Steuer, Toshiki Nakai, Andrew Thomas Dyer, Luigi Talamo, Annemarie Verkerk
Abstract
The uniform information density (UID) hypothesis postulates that linguistic units are distributed in a text in such a way that the variance around an average information density is minimized. The relationship between information density and information status (IS) has so far been underexplored. In this ongoing work, we project IS annotations from the English section of the CIEP+ corpus (Verkerk and Talamo, 2024) onto the parallel sections in other languages. We then use the projected annotations to evaluate the relationship between IS and information content in a typologically diverse sample of languages. Our preliminary findings indicate that information status has an effect on information density, with the direction of the effect depending on language and part of speech.
- Anthology ID: 2026.sigtyp-main.3
- Volume: Proceedings of the 8th Workshop on Research in Computational Linguistic Typology and Multilingual NLP
- Month: March
- Year: 2026
- Address: Rabat, Morocco
- Editors: Ekaterina Vylomova, Andrei Shcherbakov, Priya Rani
- Venues: SIGTYP | WS
- Publisher: Association for Computational Linguistics
- Pages: 18–25
- URL: https://preview.aclanthology.org/ingest-ccl/2026.sigtyp-main.3/
- Cite (ACL): Julius Steuer, Toshiki Nakai, Andrew Thomas Dyer, Luigi Talamo, and Annemarie Verkerk. 2026. Evaluating the Interplay of Information Status and Information Content in a Multilingual Parallel Corpus. In Proceedings of the 8th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pages 18–25, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal): Evaluating the Interplay of Information Status and Information Content in a Multilingual Parallel Corpus (Steuer et al., SIGTYP 2026)
- PDF: https://preview.aclanthology.org/ingest-ccl/2026.sigtyp-main.3.pdf