UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets
Wenyu Wang, Mengqi Zhang, Xiaotian Ye, Zhaochun Ren, Pengjie Ren, Zhumin Chen
Abstract
Large Language Models (LLMs) inevitably acquire harmful information during training on massive datasets. LLM unlearning aims to eliminate the influence of such harmful information while maintaining the model’s overall performance. Existing unlearning methods, represented by gradient ascent-based approaches, primarily focus on forgetting target data while overlooking the crucial impact of logically related knowledge on the effectiveness of unlearning. In this paper, through both theoretical and experimental analyses, we first demonstrate that a key reason for suboptimal unlearning performance is that models can reconstruct the target content through reasoning with logically related knowledge. To address this issue, we propose Unlearning Improvement via Parameter Extrapolation (UIPE), a method that removes knowledge highly correlated with the forgetting targets. Experimental results show that UIPE significantly enhances the performance of the GA-based method and its variants on the TOFU and WMDP benchmarks.

- Anthology ID: 2025.findings-emnlp.1374
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 25212–25227
- URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1374/
- DOI: 10.18653/v1/2025.findings-emnlp.1374
- Cite (ACL): Wenyu Wang, Mengqi Zhang, Xiaotian Ye, Zhaochun Ren, Pengjie Ren, and Zhumin Chen. 2025. UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 25212–25227, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets (Wang et al., Findings 2025)
- PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1374.pdf
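The parameter-extrapolation idea named in the abstract can be sketched in miniature as follows. This is a hypothetical illustration of the general technique only, not the paper's actual UIPE formulation: the function name `uipe_extrapolate`, the coefficient `alpha`, and the specific update rule are all assumptions made for the sake of the example.

```python
def uipe_extrapolate(theta_orig, theta_ga, alpha=0.5):
    """Extrapolate past the unlearned model along the unlearning direction.

    theta_orig: flat list of parameters before unlearning.
    theta_ga:   parameters after gradient-ascent unlearning on the forget set.
    alpha:      extrapolation strength (alpha=0 returns theta_ga unchanged).

    Moving further along (theta_ga - theta_orig) is one plausible way to
    also suppress knowledge correlated with the forgetting targets.
    """
    return [g + alpha * (g - o) for o, g in zip(theta_orig, theta_ga)]


# Toy example with a 3-parameter "model": only the first two parameters
# moved during unlearning, so only they are pushed further by extrapolation.
orig = [1.0, 2.0, 3.0]
after_ga = [0.5, 2.5, 3.0]
print(uipe_extrapolate(orig, after_ga, alpha=0.5))  # → [0.25, 2.75, 3.0]
```

In a real LLM the parameters would be tensors rather than a flat list, and the paper selects which correlated knowledge to remove; this toy version only shows the shape of the extrapolation step.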