Lock on Target! Precision Unlearning via Directional Control
Yuntao Wen, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shen Gao, Shuo Shang
Abstract
Unlearning methods aim to remove harmful, sensitive, or outdated knowledge effectively without costly retraining of the model. However, existing methods suffer from two critical limitations: (1) collateral forgetting, where erasing target data inadvertently removes related but desirable knowledge, and (2) generality forgetting, where aggressive unlearning degrades the model’s general capabilities. To address these challenges, we propose DirectiOn Guide unlEarning (DOGE), a novel method that enables precise knowledge erasure by identifying and leveraging a targeted “unlearning direction” in the model’s parameter space. DOGE first extracts this direction through differential analysis of the representations of forgotten and retained samples, pinpointing the subspace associated with the unwanted knowledge. It then selectively applies updates along this direction, minimizing interference with retained information and general model performance. Experiments across multiple benchmarks demonstrate that DOGE achieves state-of-the-art unlearning precision while preserving both related knowledge and general capabilities.
- Anthology ID:
- 2025.findings-emnlp.1021
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2025
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 18782–18794
- URL:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1021/
- DOI:
- 10.18653/v1/2025.findings-emnlp.1021
- Cite (ACL):
- Yuntao Wen, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shen Gao, and Shuo Shang. 2025. Lock on Target! Precision Unlearning via Directional Control. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 18782–18794, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Lock on Target! Precision Unlearning via Directional Control (Wen et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1021.pdf
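The abstract names two steps: extracting an "unlearning direction" from the difference between forgotten and retained samples, and applying updates only along that direction. The abstract does not give the estimator or the update rule, so the following is only a rough sketch under stated assumptions, not the DOGE implementation. It assumes the direction is the normalized difference of mean representations and that "applying updates along this direction" means projecting a raw update onto it, with the update and the direction living in the same vector space. The function names and the toy numbers are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unlearning_direction(forget_reps, retain_reps):
    """Unit vector pointing from the retain-set mean to the forget-set mean.

    Assumption: a difference of means stands in for the paper's
    "differential analysis"; the actual estimator may differ.
    """
    forget_mean = mean_vector(forget_reps)
    retain_mean = mean_vector(retain_reps)
    diff = [f - r for f, r in zip(forget_mean, retain_mean)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("forget and retain means coincide; no direction to extract")
    return [d / norm for d in diff]


def project_onto(update, direction):
    """Keep only the component of `update` that lies along the unit `direction`."""
    coeff = sum(u * d for u, d in zip(update, direction))
    return [coeff * d for d in direction]


# Toy 3-dimensional representations (made-up numbers, not from the paper).
forget_reps = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
retain_reps = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

direction = unlearning_direction(forget_reps, retain_reps)

# A hypothetical raw update; only its component along `direction` is kept.
raw_update = [0.5, -0.3, 0.2]
directed_update = project_onto(raw_update, direction)

print(direction)
print(directed_update)
```

In this toy setup the forget and retain sets differ only in the first coordinate, so the projected update changes that coordinate alone and leaves the others untouched, which is the intuition behind limiting collateral and generality forgetting.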