Batch-wise Convergent Pre-training: Step-by-Step Learning Inspired by Child Language Development

Ko Yoshida, Daiki Shiono, Kai Sato, Toko Miura, Momoka Furuhashi, Jun Suzuki


Abstract
Human children acquire language from a substantially smaller amount of linguistic input than that typically required for training large language models (LLMs). This gap motivates the search for more efficient pre-training methods. Inspired by child development, curriculum learning, which progresses from simple to complex data, has been widely adopted. In this study, we propose a pre-training framework that mirrors child language acquisition, advancing step by step from words to sentences while retaining prior knowledge. We investigate whether this improves retention and efficiency under limited resources. Our approach is implemented through four components: (i) a curriculum-aligned dataset, (ii) a batch-wise convergence loop, (iii) a distance-controlled loss to mitigate forgetting, and (iv) a constraint-controlled optimizer for stability. Experiments on the BabyLM benchmark show that the proposed method performs slightly below the official baselines in overall accuracy, with larger gaps on grammar-oriented evaluations such as BLiMP. Nonetheless, it yields small but consistent gains on morphology- and discourse-related tasks (e.g., WUG-ADJ, Entity Tracking), suggesting that the approach affects different linguistic aspects unevenly under limited data conditions.
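The abstract describes components (ii) and (iii) only at a high level. As a rough illustration of how a batch-wise convergence loop combined with a parameter-distance penalty could be wired together, the following minimal PyTorch sketch loops on each batch until its loss stops improving and anchors the parameters reached at the end of the previous curriculum stage. The stage ordering, the L2 form of the distance penalty, the convergence criterion, and all names are assumptions for illustration, not the implementation from the paper.

# Minimal sketch (assumed, not the authors' code) of a batch-wise convergent
# training loop with a distance-controlled loss that penalizes drift from the
# parameters reached at the end of the previous curriculum stage.
import torch
import torch.nn.functional as F

def distance_controlled_loss(model, anchor_params, task_loss, lam=0.1):
    # Task loss plus an L2 penalty on the distance from the anchored
    # parameters (assumed form of the forgetting-mitigation term).
    dist = sum(((p - a) ** 2).sum()
               for p, a in zip(model.parameters(), anchor_params))
    return task_loss + lam * dist

def train_batchwise_convergent(model, optimizer, stages, tol=1e-3, max_steps=100):
    # 'stages' is an assumed curriculum, e.g. [word_level_batches, sentence_level_batches],
    # where each stage yields (input_ids, labels) pairs and model(input_ids) returns logits.
    anchor = [p.detach().clone() for p in model.parameters()]
    for stage in stages:
        for input_ids, labels in stage:
            prev = float("inf")
            for _ in range(max_steps):            # batch-wise convergence loop
                optimizer.zero_grad()
                logits = model(input_ids)
                task_loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                                            labels.view(-1))
                loss = distance_controlled_loss(model, anchor, task_loss)
                loss.backward()
                optimizer.step()
                if abs(prev - loss.item()) < tol:  # loss stopped improving on this batch
                    break
                prev = loss.item()
        # Retain the knowledge acquired in this stage as the new anchor.
        anchor = [p.detach().clone() for p in model.parameters()]
    return model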
Anthology ID:
2025.babylm-main.36
Volume:
Proceedings of the First BabyLM Workshop
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Lucas Charpentier, Leshem Choshen, Ryan Cotterell, Mustafa Omer Gul, Michael Y. Hu, Jing Liu, Jaap Jumelet, Tal Linzen, Aaron Mueller, Candace Ross, Raj Sanjay Shah, Alex Warstadt, Ethan Gotlieb Wilcox, Adina Williams
Venue:
BabyLM
Publisher:
Association for Computational Linguistics
Pages:
508–524
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.babylm-main.36/
Cite (ACL):
Ko Yoshida, Daiki Shiono, Kai Sato, Toko Miura, Momoka Furuhashi, and Jun Suzuki. 2025. Batch-wise Convergent Pre-training: Step-by-Step Learning Inspired by Child Language Development. In Proceedings of the First BabyLM Workshop, pages 508–524, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Batch-wise Convergent Pre-training: Step-by-Step Learning Inspired by Child Language Development (Yoshida et al., BabyLM 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.babylm-main.36.pdf