ELLA: Efficient Lifelong Learning for Adapters in Large Language Models

Shristi Das Biswas, Yue Zhang, Anwesan Pal, Radhika Bhargava, Kaushik Roy


Abstract
Large Language Models (LLMs) suffer from severe catastrophic forgetting when adapted sequentially to new tasks in a continual learning (CL) setting. Existing approaches are fundamentally limited: replay-based methods are impractical and may violate privacy, while strict orthogonality-based methods collapse under scale, since each new task is projected onto the orthogonal complement of prior updates, progressively reducing the residual degrees of freedom and eliminating forward transfer by forbidding overlap in shared representations. In this work, we introduce ELLA, a training framework built on the principle of selective subspace de-correlation. Rather than forbidding all overlap, ELLA explicitly characterizes the structure of past updates and penalizes alignment with their high-energy, task-specific directions, while preserving freedom in the low-energy residual subspaces to enable transfer. Formally, this is realized via a lightweight regularizer on a single aggregated update matrix. This mechanism is provably an anisotropic shrinkage operator that bounds interference, yielding a penalty whose memory and compute costs remain constant regardless of task-sequence length. ELLA requires no data replay, no architectural expansion, and negligible additional storage. Empirically, it achieves state-of-the-art CL performance on three popular benchmarks spanning both classification and generative tasks, with relative accuracy gains of up to 9.6% and a 35× smaller memory footprint. Furthermore, ELLA scales robustly across architectures and actively enhances the model’s zero-shot generalization on unseen tasks, establishing a principled and scalable solution for constructive lifelong LLM adaptation.
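
The abstract describes the regularizer only at a high level. As a rough illustration, the sketch below shows one way such a selective subspace de-correlation penalty could be written in PyTorch; the function name, the `energy_keep` threshold, and the use of an SVD of the aggregated past-update matrix are assumptions made here for exposition and are not taken from the paper.

```python
import torch

def subspace_decorrelation_penalty(delta_w: torch.Tensor,
                                    w_past_agg: torch.Tensor,
                                    energy_keep: float = 0.9) -> torch.Tensor:
    """Penalize alignment of the current task's update `delta_w` with the
    high-energy directions of the single aggregated past-update matrix
    `w_past_agg` (hypothetical formulation, for illustration only)."""
    # Factor the aggregated past update; columns of U are its left singular directions.
    U, S, _ = torch.linalg.svd(w_past_agg, full_matrices=False)
    # Keep the smallest leading set of directions covering `energy_keep` of the
    # spectral energy; the remaining low-energy subspace is left unconstrained.
    energy = torch.cumsum(S ** 2, dim=0) / (S ** 2).sum()
    k = int(torch.searchsorted(energy, torch.tensor(energy_keep, device=S.device)).item()) + 1
    U_k = U[:, :k]  # high-energy, task-specific directions of past updates
    # Squared Frobenius norm of the projection of the new update onto that
    # subspace: large when the new task reuses directions past tasks rely on.
    return (U_k.transpose(0, 1) @ delta_w).pow(2).sum()

# Sketch of use while training task t with a LoRA update delta_w = B @ A
# (lam is a hypothetical regularization weight):
# loss = task_loss + lam * subspace_decorrelation_penalty(B @ A, w_past_agg)
```

Because the penalty is computed against a single aggregated matrix rather than a growing list of per-task factors, its memory and compute costs stay constant as the task sequence grows, consistent with the property stated in the abstract.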
Anthology ID:
2026.eacl-long.84
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Màrquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1907–1924
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.84/
Cite (ACL):
Shristi Das Biswas, Yue Zhang, Anwesan Pal, Radhika Bhargava, and Kaushik Roy. 2026. ELLA: Efficient Lifelong Learning for Adapters in Large Language Models. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1907–1924, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
ELLA: Efficient Lifelong Learning for Adapters in Large Language Models (Das Biswas et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.84.pdf