Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing

Shigeng Chen, Linhao Luo, Zhangchi Qiu, Yanan Cao, Carl Yang, Shirui Pan


Abstract
Knowledge editing (KE) has recently emerged as a promising technique to update specific facts in large language models (LLMs) without full retraining. While existing KE methods show promising results on general-domain benchmarks, their effectiveness in the medical domain remains largely unexplored. Medical knowledge editing poses unique challenges, requiring models not only to memorize new facts but also to internalize and generalize them for reliable and interpretable clinical decision-making. In this work, we propose MedEditBench, a rigorous evaluation framework for assessing medical knowledge editing. Our preliminary results reveal that the current KE paradigm, which directly edits simple answers into the LLMs, often leads to superficial updates with poor generalization. To address this, we introduce Self-Generated Rationale Editing (SGR-Edit), which leverages model-generated rationales as editing targets, enabling deeper knowledge integration. Extensive experiments across diverse LLMs and KE methods demonstrate that SGR-Edit consistently improves editing efficacy and generalization. Furthermore, we examine the impact of sequential edits on in-domain medical knowledge, external-domain knowledge, and general model capabilities, offering practical insights for deploying KE in real-world medical applications.
Anthology ID:
2026.eacl-long.219
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Marquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
4727–4751
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.219/
Cite (ACL):
Shigeng Chen, Linhao Luo, Zhangchi Qiu, Yanan Cao, Carl Yang, and Shirui Pan. 2026. Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4727–4751, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing (Chen et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.219.pdf