Lock on Target! Precision Unlearning via Directional Control

Yuntao Wen, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shen Gao, Shuo Shang


Abstract
Unlearning methods aim to remove harmful, sensitive, or outdated knowledge from a model without costly retraining. However, existing methods suffer from two critical limitations: (1) collateral forgetting, where erasing target data inadvertently removes related but desirable knowledge, and (2) generality forgetting, where aggressive unlearning degrades the model’s general capabilities. To address these challenges, we propose DirectiOn Guide unlEarning (DOGE), a novel method that enables precise knowledge erasure by identifying and leveraging a targeted “unlearning direction” in the model’s parameter space. DOGE first extracts this direction through differential analysis of the representations of forgotten and retained samples, pinpointing the subspace associated with the unwanted knowledge. It then selectively applies updates along this direction, minimizing interference with retained information and general model performance. Experiments across multiple benchmarks demonstrate that DOGE achieves state-of-the-art unlearning precision while preserving both related knowledge and general capabilities.
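The two steps named in the abstract (extract a direction from forget versus retain samples, then restrict updates to that direction) can be illustrated with a toy sketch. The abstract does not specify how the direction is computed or how updates are applied, so everything here is an assumption: the difference of mean representations stands in for the "differential analysis", a plain vector projection stands in for the selective update, and the function names `unlearning_direction` and `project_onto_direction` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def unlearning_direction(forget_reps, retain_reps):
    """Unit vector pointing from the retain mean to the forget mean.

    ASSUMPTION: difference of means is a stand-in for the paper's
    differential analysis, which the abstract does not describe.
    """
    diff = forget_reps.mean(axis=0) - retain_reps.mean(axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("forget and retain means coincide; no direction")
    return diff / norm


def project_onto_direction(update, direction):
    """Keep only the component of `update` that lies along `direction`."""
    return np.dot(update, direction) * direction


# Toy 2-D representations (made-up numbers, for illustration only).
forget_reps = np.array([[2.0, 0.0], [4.0, 0.0]])
retain_reps = np.array([[0.0, 1.0], [0.0, -1.0]])

direction = unlearning_direction(forget_reps, retain_reps)

# A hypothetical raw update; only its part along `direction` is applied.
update = np.array([0.5, 2.0])
projected = project_onto_direction(update, direction)
residual = update - projected  # component left untouched by the update
```

In this toy setting the discarded `residual` is orthogonal to `direction`, which is the sense in which a direction-restricted update leaves the remaining coordinates alone.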
Anthology ID:
2025.findings-emnlp.1021
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
18782–18794
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1021/
DOI:
10.18653/v1/2025.findings-emnlp.1021
Cite (ACL):
Yuntao Wen, Ruixiang Feng, Feng Guo, Yifan Wang, Ran Le, Yang Song, Shen Gao, and Shuo Shang. 2025. Lock on Target! Precision Unlearning via Directional Control. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 18782–18794, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Lock on Target! Precision Unlearning via Directional Control (Wen et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1021.pdf
Checklist:
 2025.findings-emnlp.1021.checklist.pdf