Identifying the Periodicity of Information in Natural Language

Yulin OU, Yu Wang, Yang Xu, Hendrik Buschmeier


Abstract
Recent theoretical advances on information density in natural language raise the following question: To what degree does natural language exhibit periodic patterns in its encoded information? We address this question by introducing a new method called AutoPeriod of Surprisal (APS). APS adopts a canonical periodicity detection algorithm and can identify any significant periods in the surprisal sequence of a single document. Applying the algorithm to a set of corpora yields two main results. First, a considerable proportion of human language demonstrates a strong pattern of periodicity in information. Second, new periods that fall outside the distributions of typical structural units in text (e.g., sentence boundaries, elementary discourse units) are found and further confirmed via harmonic regression modeling. We conclude that the periodicity of information in language is a joint outcome of structured factors and other driving factors that take effect at longer distances. The advantages of our periodicity detection method and its potential for detecting LLM-generated text are further discussed.
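To make the idea concrete, the following is a minimal sketch of AutoPeriod-style detection on a one-dimensional sequence, followed by a single-harmonic regression check. It is not the authors' implementation: the function names (`candidate_periods`, `refine_on_acf`, `harmonic_fit`), the permutation-based threshold, the 99th-percentile cutoff, and the window size are all assumptions chosen for illustration. The input `x` stands in for a per-token surprisal sequence; here it would be any numeric array, and obtaining real surprisal values from a language model is outside the sketch.

```python
import numpy as np

def candidate_periods(x, n_perm=100, seed=0):
    """Periods (in tokens) whose periodogram power exceeds a permutation threshold."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)

    def power(v):
        return np.abs(np.fft.rfft(v)) ** 2 / n

    # Shuffling destroys temporal structure, so the largest power seen in
    # shuffled copies estimates what chance alone can produce.
    max_powers = [power(rng.permutation(x))[1:].max() for _ in range(n_perm)]
    threshold = np.percentile(max_powers, 99)  # assumed cutoff

    p = power(x)
    freqs = np.fft.rfftfreq(n)  # cycles per token
    return [1.0 / freqs[k] for k in range(1, len(p)) if p[k] > threshold]

def refine_on_acf(x, period):
    """Snap a candidate to the nearest autocorrelation peak, or return None."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    r = np.correlate(x, x, mode="full")[n - 1:]
    r = r / r[0]
    centre = int(round(period))
    w = max(1, centre // 4)  # assumed search window around the candidate
    lo, hi = max(1, centre - w), min(n - 1, centre + w)
    best = lo + int(np.argmax(r[lo:hi + 1]))
    # Accept only an interior maximum with positive correlation (a "hill").
    return best if (lo < best < hi and r[best] > 0) else None

def harmonic_fit(x, period):
    """Least-squares fit of x ~ a + b*sin(2*pi*t/P) + c*cos(2*pi*t/P)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x), dtype=float)
    X = np.column_stack([
        np.ones_like(t),
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(X, x, rcond=None)
    resid = x - X @ coef
    amplitude = float(np.hypot(coef[1], coef[2]))
    r2 = float(1.0 - resid.var() / x.var())
    return amplitude, r2

if __name__ == "__main__":
    # Synthetic stand-in for a surprisal sequence: period 20 plus noise.
    rng = np.random.default_rng(1)
    t = np.arange(1000)
    x = 5 + 2 * np.sin(2 * np.pi * t / 20) + rng.normal(0, 1, 1000)
    for p in candidate_periods(x):
        lag = refine_on_acf(x, p)
        if lag is not None:
            print(p, lag, harmonic_fit(x, lag))
```

The two-stage design mirrors the general AutoPeriod idea: the periodogram proposes candidates cheaply, and the autocorrelation function filters out spurious ones and sharpens the estimate. The harmonic fit then reports how much variance a sinusoid at that period explains.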
Anthology ID:
2026.acl-long.52
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1161–1175
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.52/
Cite (ACL):
Yulin OU, Yu Wang, Yang Xu, and Hendrik Buschmeier. 2026. Identifying the Periodicity of Information in Natural Language. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1161–1175, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Identifying the Periodicity of Information in Natural Language (OU et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.52.pdf
Checklist:
 2026.acl-long.52.checklist.pdf