Correcting Language Model Outputs by Editing Salient Layers
Kshitij Mishra, Tamer Soliman, Anil Ramakrishna, Aram Galstyan, Anoop Kumar
Abstract
Large language models can accumulate incorrect or outdated knowledge as the real world evolves. Compared to typical solutions such as retraining or retrieval-augmented generation, model editing offers an effective yet low-cost solution to this issue. However, existing model editing algorithms rely on manual selection of edit layers, which requires prior domain knowledge or expensive architecture-specific empirical layer selection methods, such as causal tracing. In this work, we propose SaLEM (Salient Layers Editing Model), an efficient solution for data-driven layer selection for the model editing task. Our solution uses layer-wise saliency maps for layer selection and matches the accuracy of prior approaches with only 1/3 of their edits, enabling efficient updates to the parametric knowledge in large language models.
- Anthology ID:
- 2024.findings-eacl.86
- Volume:
- Findings of the Association for Computational Linguistics: EACL 2024
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1295–1305
- URL:
- https://aclanthology.org/2024.findings-eacl.86
- Cite (ACL):
- Kshitij Mishra, Tamer Soliman, Anil Ramakrishna, Aram Galstyan, and Anoop Kumar. 2024. Correcting Language Model Outputs by Editing Salient Layers. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1295–1305, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- Correcting Language Model Outputs by Editing Salient Layers (Mishra et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.findings-eacl.86.pdf