A New Formulation of Zipf’s Meaning-Frequency Law through Contextual Diversity

Ryo Nagata, Kumiko Tanaka-Ishii


Abstract
This paper proposes formulating Zipf’s meaning-frequency law, the power law between word frequency and the number of meanings, as a relationship between word frequency and contextual diversity. The proposed formulation quantifies meaning counts as contextual diversity, which is based on the directions of contextualized word vectors obtained from a Language Model (LM). This formulation gives a new interpretation to the law and also enables us to examine it for a wider variety of words and corpora than previous studies have explored. In addition, this paper shows that the law becomes unobservable when the size of the LM used is small and that autoregressive LMs require many more parameters than masked LMs for the law to be observed.
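As a rough illustration of the idea in the abstract, the sketch below scores how widely a word's contextualized vectors spread in direction and then estimates a power-law exponent by a straight-line fit in log-log space. It is a minimal sketch under stated assumptions, not the authors' method: the diversity score (one minus the mean resultant length of the unit-normalized vectors) is a stand-in chosen for simplicity, and the function names `directional_diversity` and `fit_power_law_exponent` are hypothetical.

```python
import numpy as np


def directional_diversity(vectors):
    """Spread of vector directions, in [0, 1].

    ASSUMPTION: this proxy (1 - mean resultant length) is illustrative only;
    the abstract does not specify the paper's exact diversity measure.
    Returns 0 when all vectors point the same way; larger values mean the
    directions are more scattered.
    """
    v = np.asarray(vectors, dtype=float)
    # Discard magnitudes so that only directions contribute.
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    return 1.0 - np.linalg.norm(unit.mean(axis=0))


def fit_power_law_exponent(freqs, diversities):
    """Slope of log(diversity) against log(frequency).

    If diversity is proportional to frequency ** delta, the slope of the
    least-squares line in log-log space estimates delta.
    """
    log_f = np.log(np.asarray(freqs, dtype=float))
    log_d = np.log(np.asarray(diversities, dtype=float))
    slope, _intercept = np.polyfit(log_f, log_d, 1)
    return slope
```

In use, one would collect the contextualized vectors of each word from an LM, compute one diversity score per word, and pass the per-word frequencies and scores to `fit_power_law_exponent`; both inputs must be strictly positive for the logarithms to be defined.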
Anthology ID:
2025.acl-long.744
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
15323–15335
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.744/
Cite (ACL):
Ryo Nagata and Kumiko Tanaka-Ishii. 2025. A New Formulation of Zipf’s Meaning-Frequency Law through Contextual Diversity. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15323–15335, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
A New Formulation of Zipf’s Meaning-Frequency Law through Contextual Diversity (Nagata & Tanaka-Ishii, ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.744.pdf