Representation-Guided Parameter-Efficient LLM Unlearning

Zeguan Xiao, Lang Mo, Yun Chen, Lei Yang, Jiehui Zhao, Lili Yang, Guanhua Chen


Abstract
Large Language Models (LLMs) often memorize sensitive or harmful information, necessitating effective machine unlearning techniques. While existing parameter-efficient unlearning methods have shown promise, they still struggle with the forget-retain trade-off. This can be attributed to their reliance on parameter importance metrics to identify parameters that are important exclusively for the forget set, which is fundamentally limited by the superposition phenomenon. Due to the polysemantic nature of LLM parameters, such an importance metric may struggle to disentangle parameters associated with the forget and retain sets. In this work, we propose Representation-Guided Low-rank Unlearning (ReGLU), a novel approach that leverages the geometric properties of representation spaces to achieve robust and precise unlearning. First, we develop a representation-guided initialization for LoRA that identifies the optimal subspace for selective forgetting. Second, we introduce a regularization loss that constrains the outputs of the LoRA update to lie in the orthogonal complement of the retain set’s representation subspace, thereby minimizing interference with the model’s performance on the retain set. We evaluate ReGLU on the TOFU and WMDP benchmarks across multiple models. Our results demonstrate that ReGLU consistently outperforms state-of-the-art baselines, achieving superior unlearning quality while maintaining higher model utility.
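The abstract does not give the exact form of the regularization loss, so the following is only a minimal NumPy sketch of the general idea of penalizing the part of a low-rank update's output that falls inside a retain-set representation subspace. Everything here is an assumption made for illustration: the function names `retain_subspace_basis` and `orthogonality_penalty`, the use of a truncated SVD to estimate the subspace, the squared-norm penalty, the random toy data, and the choice of measuring the subspace in the output space of the update. It does not cover the representation-guided initialization and should not be read as the authors' implementation.

```python
import numpy as np


def retain_subspace_basis(retain_reps, k):
    """Return a (d, k) orthonormal basis for the top-k directions of retain_reps.

    retain_reps: (n, d) array whose rows are representation vectors.
    Assumption: the subspace is estimated with a plain truncated SVD.
    """
    _, _, vt = np.linalg.svd(retain_reps, full_matrices=False)
    return vt[:k].T


def orthogonality_penalty(update_out, basis):
    """Mean squared norm of the component of update_out inside span(basis).

    The value is zero exactly when every row of update_out lies in the
    orthogonal complement of the subspace spanned by the basis columns.
    """
    coeffs = update_out @ basis  # (n, k) coordinates inside the subspace
    return float(np.mean(np.sum(coeffs ** 2, axis=1)))


rng = np.random.default_rng(0)
d, k, r = 16, 4, 2  # hidden size, subspace rank, LoRA rank (toy values)

# Toy retain-set representations that lie in a k-dimensional subspace.
retain_reps = rng.normal(size=(32, k)) @ rng.normal(size=(k, d))
U = retain_subspace_basis(retain_reps, k)

# Hypothetical LoRA factors; the low-rank update maps x to x @ A @ B.
A = rng.normal(size=(d, r))
B = rng.normal(size=(r, d))
x = rng.normal(size=(8, d))
lora_out = x @ A @ B

penalty_before = orthogonality_penalty(lora_out, U)

# Reference point: explicitly removing the in-subspace component drives
# the penalty to (numerically) zero.
projected = lora_out - (lora_out @ U) @ U.T
penalty_after = orthogonality_penalty(projected, U)
```

In a training loop, a term like `orthogonality_penalty` would be added to the unlearning objective with some weight, so that gradient updates to `A` and `B` are discouraged from changing outputs along retain-set directions; the explicit projection above is shown only to confirm what a zero penalty corresponds to.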
Anthology ID:
2026.findings-acl.717
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14602–14616
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.717/
Cite (ACL):
Zeguan Xiao, Lang Mo, Yun Chen, Lei Yang, Jiehui Zhao, Lili Yang, and Guanhua Chen. 2026. Representation-Guided Parameter-Efficient LLM Unlearning. In Findings of the Association for Computational Linguistics: ACL 2026, pages 14602–14616, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Representation-Guided Parameter-Efficient LLM Unlearning (Xiao et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.717.pdf
Checklist:
 2026.findings-acl.717.checklist.pdf