Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models
Yuval Weiss, David Demitri Africa, Paula Buttery, Richard Diehl Martinez
Abstract
Parameter-efficient methods like LoRA have revolutionised large language model (LLM) fine-tuning. ReLoRA extends this idea to pretraining by repeatedly merging and reinitialising low-rank adapters, increasing cumulative rank while keeping updates cheap. This aligns well with observations that high-capacity models learn through locally low-rank trajectories that expand over time. By contrast, recent work suggests that small language models (SLMs) exhibit rank deficiencies and under-utilise their available dimensionality. This raises a natural question: can ReLoRA’s rank-expanding update rule steer SLMs toward healthier learning dynamics, mitigating rank bottlenecks in a capacity-constrained regime? We argue SLMs are an ideal testbed: they train quickly, enable controlled ablations, and make rank phenomena more measurable. We present the first systematic study of ReLoRA in SLMs (11M–66M parameters), evaluating both performance and learning dynamics. Across loss, Paloma perplexity, and BLiMP, we find that ReLoRA underperforms full-rank training, with gaps widening at larger scales. Analysis of proportional effective rank and condition numbers shows that ReLoRA amplifies existing rank deficiencies and induces ill-conditioned updates early in training. Our results suggest that while ReLoRA’s merge-and-restart strategy can expand ranks in larger models, it does not straightforwardly translate to capacity-limited SLMs, motivating adaptive-rank or hybrid-rank approaches for low-compute pretraining.
- Anthology ID:
- 2025.blackboxnlp-1.9
- Volume:
- Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Yonatan Belinkov, Aaron Mueller, Najoung Kim, Hosein Mohebbi, Hanjie Chen, Dana Arad, Gabriele Sarti
- Venues:
- BlackboxNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 163–175
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.blackboxnlp-1.9/
- Cite (ACL):
- Yuval Weiss, David Demitri Africa, Paula Buttery, and Richard Diehl Martinez. 2025. Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models. In Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 163–175, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models (Weiss et al., BlackboxNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.blackboxnlp-1.9.pdf
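The abstract refers to ReLoRA’s merge-and-restart update rule and to two diagnostics, proportional effective rank and condition number. The PyTorch sketch below is a minimal illustration of one plausible reading of those ideas, not the authors’ released implementation: the `ReLoRALinear` module, its hyperparameters, and the entropy-based (Roy & Vetterli-style) effective-rank normalisation are assumptions made for clarity.

```python
# Illustrative sketch only; module names, hyperparameters, and the exact
# metric definitions are assumptions, not the paper's code.
import math
import torch
import torch.nn as nn


class ReLoRALinear(nn.Module):
    """A frozen base weight plus a trainable low-rank (B @ A) adapter."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)          # only the adapter is trained
        self.A = nn.Parameter(torch.empty(r, in_features))
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: adapter starts as a no-op
        nn.init.kaiming_uniform_(self.A, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

    @torch.no_grad()
    def merge_and_reset(self) -> None:
        """ReLoRA-style restart: fold the adapter into the base weight,
        then re-initialise it so the next cycle can add new rank directions."""
        self.base.weight += (self.B @ self.A) * self.scaling
        nn.init.kaiming_uniform_(self.A, a=math.sqrt(5))
        self.B.zero_()


@torch.no_grad()
def proportional_effective_rank(weight: torch.Tensor, eps: float = 1e-12) -> float:
    """Entropy-based effective rank, normalised by the maximum attainable rank."""
    s = torch.linalg.svdvals(weight)
    p = s / (s.sum() + eps)                     # singular values as a distribution
    effective_rank = torch.exp(-(p * torch.log(p + eps)).sum())
    return (effective_rank / min(weight.shape)).item()


@torch.no_grad()
def condition_number(weight: torch.Tensor, eps: float = 1e-12) -> float:
    """Ratio of largest to smallest singular value; large values signal ill-conditioning."""
    s = torch.linalg.svdvals(weight)
    return (s.max() / (s.min() + eps)).item()


if __name__ == "__main__":
    layer = ReLoRALinear(256, 256, r=8)
    # ... train the adapter for one cycle, then restart:
    layer.merge_and_reset()
    print(proportional_effective_rank(layer.base.weight))
    print(condition_number(layer.base.weight))
```

In this reading, each merge adds at most rank-r structure to the base weight, so repeated cycles can grow the cumulative rank of the update even though any single cycle is low-rank; the two metrics are the kind of diagnostics the abstract uses to argue that, in small models, this process can instead amplify rank deficiencies and ill-conditioning.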