Ruipeng Wang


2025

pdf bib
Neuron-Level Sequential Editing for Large Language Models
Houcheng Jiang | Junfeng Fang | Tianyu Zhang | Baolong Bi | An Zhang | Ruipeng Wang | Tao Liang | Xiang Wang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This work explores sequential model editing in large language models (LLMs), a critical task that involves modifying internal knowledge within LLMs continuously through multi-round editing, each incorporating updates or corrections to adjust the model’s outputs without the need for costly retraining. Existing model editing methods, especially those that alter model parameters, typically focus on single-round editing and often face significant challenges in sequential model editing-most notably issues of model forgetting and failure. To address these challenges, we introduce a new model editing method, namely Neuron-level Sequential Editing (NSE), tailored for supporting sequential model editing. Specifically, we optimize the target layer’s hidden states using the model’s original weights to prevent model failure. Furthermore, we iteratively select neurons in multiple layers for editing based on their activation values to mitigate model forgetting. Our empirical experiments demonstrate that NSE significantly outperforms current modifying parameters model editing methods, marking a substantial advancement in the field of sequential model editing. Our code is released on https://anonymous.4open.science/r/NSE-0A8D/.

pdf bib
Forget the Token and Pixel: Rethinking Gradient Ascent for Concept Unlearning in Multimodal Generative Models
Jiaqi Li | Chuanyi Zhang | Miaozeng Du | Hui Zhang | Yongrui Chen | Qianshan Wei | Junfeng Fang | Ruipeng Wang | Sheng Bi | Guilin Qi
Findings of the Association for Computational Linguistics: ACL 2025

Gradient Ascent (GA) has emerged as a promising approach for concept unlearning in Multimodal Generative Models (MGMs), such as Multimodal Large Language Models (MLLMs) and Stable Diffusion Models (SDMs). Despite its effectiveness in removing undesired knowledge, GA leads to severe utility degradation in MGMs. In this paper, we explore the mechanism behind this degradation by quantifying two distinct forms of knowledge in MGMs: (i) Conceptual Knowledge, which represents specific information about concepts; (ii) Natural Knowledge, which refers to the ability to produce coherent and logically structured outputs. Our analysis reveals that applying GA globally not only removes the targeted Conceptual Knowledge but also inadvertently diminishes Natural Knowledge, resulting in utility collapse. To address this issue, we propose Forget the Token and Pixel (FTTP), a novel approach that selectively applies GA to targeted Conceptual Knowledge while preserving Natural Knowledge through Gradient Descent (GD). FTTP eliminates the need for additional retain sets and a large number of training steps, thereby reducing computational resource costs. Extensive experiments demonstrate FTTP’s efficiency and superior utility-unlearning tradeoff for both text and image generation tasks. Our source code will be released in the near future.