An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning

Andrew Bai; Chih-Kuan Yeh; Cho-Jui Hsieh; Ankur Taly

Abstract

Incrementally fine-tuning foundational models on new tasks or domains is now the de facto approach in NLP. A known pitfall of this approach is the catastrophic forgetting of prior knowledge during fine-tuning. A common way to alleviate such forgetting is to rehearse samples from prior tasks during fine-tuning. Several existing works assume a fixed memory buffer for storing prior-task examples and rely on inferences (forward passes) with the model at hand to choose which buffered examples to rehearse. However, given the increasing computational cost of model inference and the decreasing cost of data storage, we focus on rehearsing samples under a fixed computational budget rather than a fixed memory budget. We propose a sampling scheme, mix-cd, that prioritizes rehearsal of “collateral damage” samples, which are samples predicted correctly by the prior model but forgotten by the incrementally tuned one. The crux of our scheme is a procedure to efficiently estimate the density of collateral damage samples without incurring additional model inferences. Our approach is computationally efficient, easy to implement, and outperforms several leading continual learning methods in compute-constrained settings. All the code will be publicly available at https://github.com/jybai/mix-cd-rehearsal.
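
To make the definition of a “collateral damage” sample concrete, the following is a minimal sketch, not the authors' implementation. It assumes a classification setting in which predictions from the prior model and the incrementally tuned model are already available for a set of prior-task examples. The function names `collateral_damage_mask` and `collateral_damage_rate` are hypothetical, and the sketch does not reproduce the inference-free density estimation procedure that mix-cd relies on, which the abstract does not specify.

```python
from typing import List, Sequence


def collateral_damage_mask(
    labels: Sequence[int],
    prior_preds: Sequence[int],
    tuned_preds: Sequence[int],
) -> List[bool]:
    """Flag samples the prior model got right but the tuned model gets wrong.

    Hypothetical helper: it illustrates the definition in the abstract only.
    """
    if not (len(labels) == len(prior_preds) == len(tuned_preds)):
        raise ValueError("labels and predictions must have the same length")
    return [
        prior == label and tuned != label
        for label, prior, tuned in zip(labels, prior_preds, tuned_preds)
    ]


def collateral_damage_rate(mask: Sequence[bool]) -> float:
    """Fraction of samples flagged as collateral damage (0.0 for empty input)."""
    return sum(mask) / len(mask) if mask else 0.0


# Toy example with made-up labels and predictions.
labels = [0, 1, 1, 0]
prior_preds = [0, 1, 0, 0]  # prior model is correct on samples 0, 1, 3
tuned_preds = [0, 0, 1, 1]  # tuned model now misses samples 1 and 3
mask = collateral_damage_mask(labels, prior_preds, tuned_preds)
rate = collateral_damage_rate(mask)
```

Here samples 1 and 3 are flagged: the prior model classified them correctly and the tuned model no longer does. Sample 2 is not flagged because the prior model was already wrong on it.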

Anthology ID: 2025.findings-naacl.138
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2557–2569
URL: https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.findings-naacl.138/
Cite (ACL): Andrew Bai, Chih-Kuan Yeh, Cho-Jui Hsieh, and Ankur Taly. 2025. An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 2557–2569, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning (Bai et al., Findings 2025)
PDF: https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.findings-naacl.138.pdf
