CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models

Pavan Kalyan Tankala, Shubhra Mishra, Satya Lokam, Navin Goyal

Abstract
We introduce CurLL, a comprehensive continual learning dataset and benchmark grounded in human developmental trajectories from ages 5–10, enabling systematic, fine-grained assessment of a model's ability to progressively acquire new skills. CurLL spans five developmental stages (0–4) covering ages 5–10, with a skill graph of 32 high-level skills, 128 sub-skills, 350+ goals, and 1,300+ indicators that explicitly models prerequisite relationships. We generate a 23.4B-token synthetic dataset with controlled skill progression, vocabulary complexity, and format diversity, comprising paragraphs, comprehension-based QA (CQA), skill-testing QA (CSQA), and instruction–response (IR) pairs. Stage-wise token counts range from 2.12B to 6.78B, supporting precise analysis of forgetting, forward transfer, and backward transfer. Using a 135M-parameter transformer trained under independent, joint, and sequential (continual) setups, we show trade-offs in skill retention and transfer efficiency. By mirroring human learning patterns and providing fine-grained control over skill dependencies, this work advances continual learning evaluation for language models.
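The forgetting, forward-transfer, and backward-transfer analysis mentioned above can be made concrete with the standard continual-learning metrics from the literature. Below is a minimal Python sketch, assuming a stage-by-stage score matrix R where R[i][j] is the evaluation score on stage j after training through stage i; the function name and toy numbers are hypothetical, and the paper's exact formulas may differ.

```python
import numpy as np

def continual_learning_metrics(R, b=None):
    """Standard continual-learning metrics from a stage-score matrix.

    R[i][j] : score on stage j after finishing training on stage i
              (indices 0..T-1, matching CurLL's stages 0-4).
    b[j]    : optional baseline score on stage j for a model never
              trained on it (needed for forward transfer).
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]

    # Forgetting of stage j: best score ever reached on j minus the
    # final score, averaged over all but the last stage.
    forgetting = np.mean([R[:T - 1, j].max() - R[T - 1, j]
                          for j in range(T - 1)])

    # Backward transfer (BWT): final score on each earlier stage
    # relative to the score right after that stage was learned.
    bwt = np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)])

    # Forward transfer (FWT): score on stage j just before training
    # on it, relative to the untrained baseline.
    fwt = None
    if b is not None:
        b = np.asarray(b, dtype=float)
        fwt = np.mean([R[j - 1, j] - b[j] for j in range(1, T)])

    return {"forgetting": forgetting, "bwt": bwt, "fwt": fwt}

# Toy 5-stage example: earlier-stage scores decay as training advances.
R = [[0.80, 0.10, 0.05, 0.05, 0.05],
     [0.70, 0.78, 0.12, 0.06, 0.05],
     [0.62, 0.70, 0.75, 0.15, 0.08],
     [0.55, 0.64, 0.68, 0.74, 0.18],
     [0.50, 0.58, 0.63, 0.69, 0.72]]
print(continual_learning_metrics(R, b=[0.05] * 5))
```

Under these definitions, large forgetting values indicate that sequential training erodes earlier skills, while positive BWT would mean later stages actually improve them; comparing the sequential setup against the independent and joint baselines isolates these effects.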
Anthology ID: 2025.babylm-main.20
Volume: Proceedings of the First BabyLM Workshop
Month: November
Year: 2025
Address: Suzhou, China
Editors: Lucas Charpentier, Leshem Choshen, Ryan Cotterell, Mustafa Omer Gul, Michael Y. Hu, Jing Liu, Jaap Jumelet, Tal Linzen, Aaron Mueller, Candace Ross, Raj Sanjay Shah, Alex Warstadt, Ethan Gotlieb Wilcox, Adina Williams
Venue: BabyLM
Publisher: Association for Computational Linguistics
Pages: 256–278
URL: https://preview.aclanthology.org/ingest-emnlp/2025.babylm-main.20/
Cite (ACL): Pavan Kalyan Tankala, Shubhra Mishra, Satya Lokam, and Navin Goyal. 2025. CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models. In Proceedings of the First BabyLM Workshop, pages 256–278, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models (Tankala et al., BabyLM 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.babylm-main.20.pdf