Function Words as Statistical Cues for Language Learning

Xiulin Yang, Heidi R. Getz, Ethan Gotlieb Wilcox


Abstract
What statistical properties might support learning abstract grammatical knowledge from linear input? We address this question by examining the statistical distribution of function words. Function words have been argued to aid acquisition through three distributional properties: high frequency, reliable syntactic association, and phrase-boundary alignment. We conduct a cross-linguistic corpus analysis of 186 languages, which confirms that all three properties are universal. Using counterfactual language modeling and ablation experiments on English, we show that preserving these properties facilitates acquisition in neural learners, with a Goldilocks effect: function words must be frequent enough to be reliable, yet diverse enough to remain informative about structural dependencies. Probing analyses further reveal that different learning conditions produce systematically different reliance on function words.
Anthology ID:
2026.acl-long.728
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
16042–16058
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.728/
Cite (ACL):
Xiulin Yang, Heidi R. Getz, and Ethan Gotlieb Wilcox. 2026. Function Words as Statistical Cues for Language Learning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16042–16058, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Function Words as Statistical Cues for Language Learning (Yang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.728.pdf
Checklist:
2026.acl-long.728.checklist.pdf