UNLEARN Efficient Removal of Knowledge in Large Language Models

Tyler Lizzo, Larry Heck


Abstract
Large Language Models (LLMs) excel in many Natural Language Processing tasks but are outperformed by specialized tools for certain tasks. This raises the question: Can we reduce redundant LLM parameters when using these tools? Given the size and high training costs of LLMs, it is essential to efficiently forget specific knowledge without retraining. This paper introduces UNLEARN, a novel method that uses subspace techniques to selectively remove knowledge without access to the original training data, without retraining, and with minimal impact on other tasks. Our results show that UNLEARN significantly outperforms previous methods for forgetting targeted (unwanted) knowledge while also preserving related (wanted) knowledge. We also propose LEARN, a complementary approach for targeted knowledge addition, which achieves fine-tuning accuracy comparable to Low-Rank Adaptation (LoRA) without degrading related task performance.
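The abstract describes the approach only at a high level. As a rough, illustrative sketch of what subspace-based knowledge removal can look like (a hypothetical PyTorch example under our own assumptions, not the authors' released implementation), one might estimate a low-rank subspace from hidden activations gathered on prompts for the unwanted knowledge and then project a weight matrix onto the orthogonal complement of that subspace:

import torch

def task_subspace(activations: torch.Tensor, k: int) -> torch.Tensor:
    """Return an orthonormal basis (d x k) for the top-k activation directions."""
    # activations: (num_samples, d) hidden states collected on "forget" prompts
    U, S, Vh = torch.linalg.svd(activations, full_matrices=False)
    return Vh[:k].T  # columns span the estimated task subspace

def unlearn_projection(weight: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    """Project the weight's input directions onto the orthogonal complement
    of the task subspace, suppressing the targeted knowledge."""
    # weight: (out_features, in_features); basis: (in_features, k)
    P = basis @ basis.T                              # projector onto task subspace
    return weight @ (torch.eye(weight.shape[1]) - P)

# Usage with hypothetical shapes:
acts = torch.randn(512, 768)    # hidden states on prompts for the unwanted task
B = task_subspace(acts, k=16)   # rank k is an assumed hyperparameter
W = torch.randn(3072, 768)      # an MLP weight matrix from one layer
W_unlearned = unlearn_projection(W, B)

The rank k, the choice of layer, and the use of raw hidden states to estimate the subspace are assumptions made for illustration only; the paper's actual construction and its evaluation are given in the PDF linked below.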
Anthology ID:
2025.findings-naacl.405
Volume:
Findings of the Association for Computational Linguistics: NAACL 2025
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7257–7268
URL:
https://preview.aclanthology.org/moar-dois/2025.findings-naacl.405/
DOI:
10.18653/v1/2025.findings-naacl.405
Cite (ACL):
Tyler Lizzo and Larry Heck. 2025. UNLEARN Efficient Removal of Knowledge in Large Language Models. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 7257–7268, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
UNLEARN Efficient Removal of Knowledge in Large Language Models (Lizzo & Heck, Findings 2025)
PDF:
https://preview.aclanthology.org/moar-dois/2025.findings-naacl.405.pdf