Shigeng Chen


2026

Knowledge editing (KE) has recently emerged as a promising technique for updating specific facts in large language models (LLMs) without full retraining. While existing KE methods show encouraging results on general-domain benchmarks, their effectiveness in the medical domain remains largely unexplored. Medical knowledge editing poses unique challenges, requiring models not only to memorize new facts but also to internalize and generalize them for reliable and interpretable clinical decision-making. In this work, we propose MedEditBench, a rigorous evaluation framework for assessing medical knowledge editing. Our preliminary results reveal that the current KE paradigm, in which short factual answers are edited directly into the LLM, often leads to superficial updates with poor generalization. To address this, we introduce Self-Generated Rationale Editing (SGR-Edit), which uses model-generated rationales as editing targets, enabling deeper knowledge integration. Extensive experiments across diverse LLMs and KE methods demonstrate that SGR-Edit consistently improves editing efficacy and generalization. Furthermore, we examine the impact of sequential edits on in-domain medical knowledge, external-domain knowledge, and general model capabilities, offering practical insights for deploying KE in real-world medical applications.