Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models

Yuval Weiss, David Demitri Africa, Paula Buttery, Richard Diehl Martinez


Abstract
Parameter-efficient methods like LoRA have revolutionised large language model (LLM) fine-tuning. ReLoRA extends this idea to pretraining by repeatedly merging and reinitialising low-rank adapters, increasing cumulative rank while keeping updates cheap. This aligns well with observations that high-capacity models learn through locally low-rank trajectories that expand over time. By contrast, recent work suggests that small language models (SLMs) exhibit rank deficiencies and under-utilise their available dimensionality. This raises a natural question: can ReLoRA’s rank-expanding update rule steer SLMs toward healthier learning dynamics, mitigating rank bottlenecks in a capacity-constrained regime? We argue SLMs are an ideal testbed: they train quickly, enable controlled ablations, and make rank phenomena more measurable. We present the first systematic study of ReLoRA in SLMs (11M-66M parameters), evaluating both performance and learning dynamics. Across loss, Paloma perplexity, and BLiMP, we find that ReLoRA underperforms full-rank training, with gaps widening at larger scales. Analysis of proportional effective rank and condition numbers shows that ReLoRA amplifies existing rank deficiencies and induces ill-conditioned updates early in training. Our results suggest that while ReLoRA’s merge-and-restart strategy can expand ranks in larger models, it does not straightforwardly translate to capacity-limited SLMs, motivating adaptive-rank or hybrid-rank approaches for low-compute pretraining.
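The merge-and-reinitialise cycle the abstract describes can be illustrated in a few lines of PyTorch. The sketch below is a minimal, hypothetical illustration of the ReLoRA update rule (Lialin et al., 2023), not the authors' implementation: a frozen base weight is augmented with trainable low-rank factors, and at each restart the factor product is folded into the base weight before the factors are reinitialised. The class name, rank, and initialisation scales are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ReLoRALinear(nn.Module):
    """Minimal ReLoRA-style linear layer (illustrative sketch only).

    Holds a frozen base weight W plus trainable low-rank factors B and A.
    The effective weight is W + B @ A; merge_and_reinit() folds the current
    low-rank update into W and restarts the factors, so successive cycles
    can contribute updates in different subspaces, increasing the
    cumulative rank of the total change to W.
    """

    def __init__(self, in_features: int, out_features: int, rank: int = 8):
        super().__init__()
        # Base weight is frozen between merges; only A and B receive gradients.
        self.weight = nn.Parameter(
            torch.empty(out_features, in_features), requires_grad=False
        )
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.02)
        self.B = nn.Parameter(torch.zeros(out_features, rank))  # update starts at zero

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.weight + self.B @ self.A).T

    @torch.no_grad()
    def merge_and_reinit(self) -> None:
        # Fold the rank-r update into the base weight, then restart the
        # factors. The full ReLoRA recipe also prunes optimizer state and
        # re-warms the learning rate at each restart (omitted here).
        self.weight += self.B @ self.A
        nn.init.normal_(self.A, std=0.02)
        nn.init.zeros_(self.B)
```

In training, `merge_and_reinit()` would be invoked on a fixed schedule (e.g. every few thousand steps), alongside the optimizer-state reset and learning-rate warm restart that the full method prescribes.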
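The abstract's rank analysis rests on the effective rank of weight matrices. As a hedged sketch, assuming effective rank in the sense of Roy and Vetterli (2007) normalised by the maximum attainable rank (the paper's exact definition of "proportional effective rank" may differ), it can be computed from the singular values:

```python
import torch


def proportional_effective_rank(W: torch.Tensor) -> float:
    """Effective rank of W, as a fraction of its maximum possible rank.

    Effective rank here follows Roy & Vetterli (2007): the exponential of
    the Shannon entropy of the normalised singular-value distribution.
    Values near 1 mean the matrix spreads energy evenly across its
    available dimensions; values near 0 indicate rank deficiency.
    """
    s = torch.linalg.svdvals(W.float())
    p = s / s.sum()
    entropy = -(p * torch.log(p.clamp_min(1e-12))).sum()
    return (torch.exp(entropy) / min(W.shape)).item()
```

The condition-number diagnostic mentioned in the abstract can be read off the same singular values as the ratio of the largest to the smallest (or via `torch.linalg.cond`); large values flag the ill-conditioned updates the paper reports early in ReLoRA training.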
Anthology ID: 2025.blackboxnlp-1.9
Volume: Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Month: November
Year: 2025
Address: Suzhou, China
Editors: Yonatan Belinkov, Aaron Mueller, Najoung Kim, Hosein Mohebbi, Hanjie Chen, Dana Arad, Gabriele Sarti
Venues: BlackboxNLP | WS
Publisher: Association for Computational Linguistics
Pages: 163–175
URL: https://preview.aclanthology.org/ingest-emnlp/2025.blackboxnlp-1.9/
Cite (ACL): Yuval Weiss, David Demitri Africa, Paula Buttery, and Richard Diehl Martinez. 2025. Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models. In Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 163–175, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models (Weiss et al., BlackboxNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.blackboxnlp-1.9.pdf