Initializing and Retrofitting Key-Value Adaptors for Traceable Model Editing

Hanlun Zhu, Yunshi Lan, Xiang Li, Weining Qian


Abstract
As our understanding of how knowledge is stored in language models deepens, the ability to perform CRUD (Create, Read, Update, Delete) operations on that knowledge becomes increasingly indispensable for managing rapidly changing information. Given the high cost of fine-tuning language models, low-cost model editing methods are usually preferred for manipulating a model's knowledge. Evidence suggests that the modules carrying knowledge in a Transformer layer are primarily the MLP blocks, so we propose iReVa, a method that explicitly initializes and retrofits key-value pairs into MLP blocks to construct a new mapping for a piece of knowledge without damaging irrelevant knowledge. Compared with existing methods, iReVa offers better interpretability and a stronger capacity for carrying traceable edits. Experimental results on a series of GPT models show strong performance in edit success and generalization without harming specificity. We also make the first attempt at a knowledge withdrawal test with iReVa. Our code is available at https://github.com/timberflow/iReVa.
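
To make the abstract's key-value view of MLP blocks concrete, here is a minimal sketch (not the authors' implementation; see the linked repository for that) of how a single traceable key-value slot could be appended to a GPT-style MLP: the input projection holds "keys" and the output projection holds "values", so one new key column plus one new value row stores one edit. The class name, dimensions, and initialization of the new slot below are illustrative assumptions.

```python
import torch
import torch.nn as nn


class EditableMLP(nn.Module):
    """Two-layer MLP as in GPT-style blocks: act(x @ W_in) @ W_out."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.W_in = nn.Parameter(torch.randn(d_model, d_ff) * 0.02)   # key matrix
        self.W_out = nn.Parameter(torch.randn(d_ff, d_model) * 0.02)  # value matrix
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x @ self.W_in) @ self.W_out

    @torch.no_grad()
    def add_key_value_slot(self, key: torch.Tensor, value: torch.Tensor) -> int:
        """Append one key column and one value row; return the slot index so the
        edit stays traceable and can later be inspected or withdrawn (removed)."""
        self.W_in = nn.Parameter(torch.cat([self.W_in, key.view(-1, 1)], dim=1))
        self.W_out = nn.Parameter(torch.cat([self.W_out, value.view(1, -1)], dim=0))
        return self.W_in.shape[1] - 1


# Usage sketch: in practice the key would be initialized from the edit prompt's
# hidden state and the value toward the new fact's representation; random
# vectors here are placeholders.
mlp = EditableMLP(d_model=768, d_ff=3072)
slot = mlp.add_key_value_slot(key=torch.randn(768), value=torch.randn(768))
print("edit stored in slot", slot)
```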
Anthology ID:
2025.findings-acl.152
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
2958–2971
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.152/
Cite (ACL):
Hanlun Zhu, Yunshi Lan, Xiang Li, and Weining Qian. 2025. Initializing and Retrofitting Key-Value Adaptors for Traceable Model Editing. In Findings of the Association for Computational Linguistics: ACL 2025, pages 2958–2971, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Initializing and Retrofitting Key-Value Adaptors for Traceable Model Editing (Zhu et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.152.pdf