Unveiling and Addressing Pseudo Forgetting in Large Language Models

Huashan Sun, Yizhe Yang, Yinghao Li, Jiawei Li, Yang Gao


Abstract
Although substantial efforts have been made to mitigate catastrophic forgetting in continual learning, the underlying mechanisms are not well understood. In this work, we demonstrate the existence of “pseudo forgetting”: the performance degradation on previous tasks is due not to a loss of capabilities but to the failure of the instructions to activate the appropriate model capabilities. We show that the model’s performance on previous tasks can be restored through two simple interventions: (1) providing a partial external correct rationale, and (2) appending semantically meaningless suffixes to the original instructions, to guide the generation of correct rationales. Through empirical analysis of the internal mechanisms governing rationale generation, we reveal that models exhibiting pseudo forgetting show reduced instruction dependence during rationale generation, leading to suboptimal activation of their inherent capabilities. Based on this insight, we propose the Rationale-Guidance Difficulty based Replay (RGD-R) framework, which dynamically allocates replay data based on the model’s ability to correctly leverage its intrinsic capabilities. Experimental results demonstrate that RGD-R effectively mitigates pseudo forgetting while maintaining model plasticity.
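The abstract only sketches how RGD-R distributes replay data, so here is a minimal, hypothetical Python sketch of the general idea: score each previous task by how hard it is to guide the model back to a correct rationale (the paper's Rationale-Guidance Difficulty; the scoring used below is an assumed proxy such as the mean negative log-likelihood of the gold rationale, not the paper's definition), then split a fixed replay budget in proportion to those scores. The function names, task names, and toy scores are all illustrative, not taken from the paper.

```python
from typing import Dict


def allocate_replay(difficulty: Dict[str, float], budget: int) -> Dict[str, int]:
    """Split a fixed replay budget across previous tasks in proportion to
    their difficulty scores, so harder-to-guide tasks receive more replay data.

    `difficulty` maps a task name to a non-negative score (an assumed proxy,
    e.g. mean NLL of the correct rationale under the current model)."""
    total = sum(difficulty.values())
    if total == 0:
        # Fall back to a uniform split when every task looks equally easy.
        base = budget // len(difficulty)
        return {task: base for task in difficulty}
    alloc = {task: int(budget * d / total) for task, d in difficulty.items()}
    # Hand out any rounding remainder to the hardest tasks first.
    remainder = budget - sum(alloc.values())
    for task in sorted(difficulty, key=difficulty.get, reverse=True)[:remainder]:
        alloc[task] += 1
    return alloc


if __name__ == "__main__":
    # Toy, made-up scores for three earlier tasks.
    scores = {"task_A": 0.9, "task_B": 0.3, "task_C": 1.8}
    print(allocate_replay(scores, budget=300))
    # -> roughly {'task_A': 90, 'task_B': 30, 'task_C': 180}
```

The design point this is meant to illustrate is simply that replay is budget-aware and difficulty-proportional rather than uniform; how the difficulty score itself is computed is specified in the paper, not here.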
Anthology ID: 2025.findings-acl.1212
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues: Findings | WS
Publisher: Association for Computational Linguistics
Pages: 23642–23658
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1212/
Cite (ACL): Huashan Sun, Yizhe Yang, Yinghao Li, Jiawei Li, and Yang Gao. 2025. Unveiling and Addressing Pseudo Forgetting in Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 23642–23658, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Unveiling and Addressing Pseudo Forgetting in Large Language Models (Sun et al., Findings 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1212.pdf