Expect the Unexpected? Testing the Surprisal of Salient Entities

Jessica Lin, Amir Zeldes


Abstract
Previous work examining the Uniform Information Density (UID) hypothesis has shown that while information as measured by surprisal metrics is distributed more or less evenly across documents overall, local discrepancies can arise due to functional pressures corresponding to syntactic and discourse structural constraints. However, work thus far has largely disregarded the relative salience of discourse participants. We fill this gap by studying how overall salience of entities in discourse relates to surprisal using 70K manually annotated mentions across 16 genres of English and a novel minimal-pair prompting method. Our results show that globally salient entities exhibit significantly higher surprisal than non-salient ones, even controlling for position, length, and nesting confounds. Moreover, salient entities systematically reduce surprisal for surrounding content when used as prompts, enhancing document-level predictability. This effect varies by genre, appearing strongest in topic-coherent texts and weakest in conversational contexts. Our findings refine the UID competing pressures framework by identifying global entity salience as a mechanism shaping information distribution in discourse.
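For readers unfamiliar with the metric: surprisal is the standard information-theoretic quantity, the negative log probability of a token given its preceding context. The sketch below shows one way a span-level (e.g. mention-level) value could be averaged from per-token probabilities. The function names and the toy probabilities are illustrative assumptions, not the authors' implementation or data.

```python
import math


def token_surprisal(prob):
    """Surprisal in bits of a token whose conditional probability is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def mean_surprisal(probs):
    """Average surprisal in bits over the tokens of one span (e.g. a mention)."""
    if not probs:
        raise ValueError("span must contain at least one token")
    return sum(token_surprisal(p) for p in probs) / len(probs)


# Toy, made-up conditional probabilities for two hypothetical spans.
# In practice these would come from a language model's next-token distribution.
span_a = [0.125, 0.5]
span_b = [0.5, 0.5]

print(mean_surprisal(span_a))  # 2.0
print(mean_surprisal(span_b))  # 1.0
```

Averaging over tokens is only one possible aggregation; summing over the span, or other normalizations, would be equally plausible choices under different assumptions.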
Anthology ID: 2026.acl-long.1413
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 30618–30629
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1413/
Cite (ACL): Jessica Lin and Amir Zeldes. 2026. Expect the Unexpected? Testing the Surprisal of Salient Entities. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 30618–30629, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Expect the Unexpected? Testing the Surprisal of Salient Entities (Lin & Zeldes, ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1413.pdf
Checklist: 2026.acl-long.1413.checklist.pdf