Minakshi Pradeep Atre


2025

Stacked LoRA: Isolated Low-Rank Adaptation for Lifelong Knowledge Management
Heramb Vivek Patil | Vaishnavee Sanam | Minakshi Pradeep Atre
The 14th International Joint Conference on Natural Language Processing and The 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

Continual learning (CL) presents a significant challenge for large pre-trained models, primarily due to catastrophic forgetting and the high computational cost of sequential knowledge updating. Parameter-Efficient Transfer Learning (PETL) methods reduce the computational burden but often struggle to effectively mitigate forgetting. This paper introduces Stacked Low-Rank Adaptation (SLoRA), a novel parameter-efficient approach that leverages the additive composition of task-specific, frozen low-rank adapters to enable modular continual learning with inherent support for explicit knowledge modification. SLoRA was evaluated on vision benchmarks, BERT-base, and the 1-billion-parameter Llama-3.2-1B model. Experiments demonstrated that SLoRA almost completely eliminated catastrophic forgetting, achieving a final average accuracy of 92.75% on Llama-3.2-1B while perfectly preserving prior task performance. Furthermore, SLoRA is computationally efficient, enabling up to a 15x training speed-up over full fine-tuning with 99.7% fewer trainable parameters per update. SLoRA offers a compelling balance of forgetting mitigation, parameter efficiency, and modularity, representing a promising direction for building adaptable, efficient foundation models capable of lifelong knowledge management.
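The abstract describes SLoRA as an additive composition of task-specific, frozen low-rank adapters, with only the newest adapter trained per update. The following is a minimal PyTorch-style sketch of that idea under stated assumptions: the class and method names (StackedLoRALinear, add_task_adapter) and hyperparameters are hypothetical illustrations, not the paper's actual implementation.

```python
# Minimal sketch of additive, stacked low-rank adapters over a frozen base layer.
# Assumptions: one (down, up) adapter pair per task; earlier adapters are frozen;
# names and hyperparameters are illustrative only, not taken from the paper.
import torch
import torch.nn as nn


class StackedLoRALinear(nn.Module):
    """Frozen base linear layer plus a stack of task-specific LoRA adapters.

    Only the most recently added adapter is trainable, so prior-task
    knowledge is preserved, and individual adapters can later be removed
    or replaced for explicit knowledge modification.
    """

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.rank, self.scale = rank, alpha / rank
        self.adapters = nn.ModuleList()  # one low-rank adapter per task

    def add_task_adapter(self) -> None:
        """Freeze all existing adapters and append a new trainable one."""
        for adapter in self.adapters:
            adapter.requires_grad_(False)
        in_f, out_f = self.base.in_features, self.base.out_features
        down = nn.Linear(in_f, self.rank, bias=False)
        up = nn.Linear(self.rank, out_f, bias=False)
        nn.init.zeros_(up.weight)  # new adapter starts as a zero update
        self.adapters.append(nn.Sequential(down, up))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Additive composition: frozen base output plus every adapter's delta.
        out = self.base(x)
        for adapter in self.adapters:
            out = out + self.scale * adapter(x)
        return out


# Usage: wrap a projection, add one adapter per task, train only the newest one.
layer = StackedLoRALinear(nn.Linear(768, 768), rank=8)
layer.add_task_adapter()  # task 1: trainable
layer.add_task_adapter()  # task 2: the task-1 adapter is now frozen
trainable = [p for p in layer.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable))  # only the newest adapter's parameters
```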