Maiqi Jiang


2026

Large language models (LLMs) excel at factual recall, yet they can still propagate stale or incorrect knowledge. In-context knowledge editing (IKE) is a gradient-free remedy suitable for black-box APIs, but editors built on in-context learning typically rely on a single retriever and surface-similarity heuristics to build prompts. A key observation in this study is that retrievers can be complementary: semantic rankers may recover paraphrased evidence, while lexical or feature-based retrievers may preserve precise entities and cues. This leaves two gaps in single-retriever editors: they (i) miss complementary evidence that different retrievers surface and (ii) cannot adapt when one retriever is clearly more reliable for a query. We introduce a Feature-Weighted Ensemble for In-context Knowledge Editing (FWE-IKE) that calibrates three heterogeneous rankers (LLM-, BERT-, and MLP-based), extracts simple confidence features from each ranker, and predicts per-query mixture weights. A conservative margin-based routing gate selects a single expert when it is confident; otherwise, the calibrated distributions are mixed using the learned per-query weights. On the CounterFact benchmark, FWE-IKE attains an 88.33% Edit-Success Rate, a +3.0-point gain over the best single retriever, approaching the oracle upper bound (91%). Case studies, an ablation study, and further analyses show that the method systematically recovers complementary wins (e.g., BERT-only, LLM-only, and MLP-only slices). FWE-IKE improves edit accuracy without touching model weights and provides a practical path to more robust, confidence-aware retrieval for IKE.
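The pipeline described in this abstract (calibrate, extract confidence features, gate, otherwise mix) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption: temperature scaling stands in for calibration, the three features (top-1 probability, top-1/top-2 margin, entropy) are illustrative choices, the gate fires when the most confident ranker's margin exceeds a hypothetical `gate_margin`, and `weight_fn` is a placeholder for the learned per-query weight predictor.

```python
import numpy as np

def softmax(x, temperature=1.0):
    """Temperature-scaled softmax (assumed here as the calibration step)."""
    z = np.asarray(x, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def confidence_features(p):
    """Illustrative features: top-1 probability, top-1/top-2 margin, entropy."""
    s = np.sort(p)[::-1]
    entropy = -np.sum(p * np.log(p + 1e-12))
    return np.array([s[0], s[0] - s[1], entropy])

def fwe_select(scores, temperatures, weight_fn, gate_margin=0.5):
    """Pick a candidate index from several rankers' raw scores.

    scores:       dict of ranker name -> raw scores over the same candidates
    temperatures: dict of ranker name -> calibration temperature (assumed)
    weight_fn:    hypothetical stand-in for the learned weight predictor;
                  maps the concatenated features to one logit per ranker
    """
    names = list(scores)
    probs = [softmax(scores[n], temperatures[n]) for n in names]
    feats = [confidence_features(p) for p in probs]
    margins = np.array([f[1] for f in feats])
    best = int(np.argmax(margins))

    if margins[best] >= gate_margin:
        # Gate fires: trust the single most confident ranker.
        mixed, route = probs[best], names[best]
    else:
        # No ranker is clearly confident: mix the calibrated distributions.
        w = softmax(weight_fn(np.concatenate(feats)))
        mixed = sum(wi * pi for wi, pi in zip(w, probs))
        route = "mixture"
    return int(np.argmax(mixed)), route

temps = {"llm": 1.0, "bert": 1.0, "mlp": 1.0}
uniform = lambda f: np.zeros(3)  # placeholder predictor: equal weights

confident = {"llm": [5.0, 0.0, 0.0], "bert": [0.1, 0.2, 0.1], "mlp": [0.0, 0.1, 0.2]}
uncertain = {"llm": [0.1, 0.2, 0.1], "bert": [0.1, 0.3, 0.1], "mlp": [0.0, 0.1, 0.2]}
print(fwe_select(confident, temps, uniform))
print(fwe_select(uncertain, temps, uniform))
```

In the first call the LLM ranker's margin is large, so the gate routes to it alone. In the second call no ranker is confident, so the three distributions are averaged under the placeholder weights.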

2025

Large language models (LLMs) excel at factual recall yet still propagate stale or incorrect knowledge. In-context knowledge editing offers a gradient-free remedy suitable for black-box APIs, but current editors rely on static demonstration sets chosen by surface-level similarity, leading to two persistent obstacles: (i) a quantity–quality trade-off, and (ii) a lack of adaptivity to task difficulty. We address these issues by dynamically selecting supporting demonstrations according to their utility for the edit. We propose **D**ynamic **R**etriever for **I**n-Context **K**nowledge **E**diting (DR-IKE), a lightweight framework that (1) trains a BERT retriever with REINFORCE to rank demonstrations by editing reward, and (2) employs a *learnable threshold σ* to prune low-value examples, shortening the prompt when the edit is easy and expanding it when the task is hard. DR-IKE performs editing without modifying model weights, relying solely on forward passes for compatibility with black-box LLMs. On the CounterFact benchmark, it improves edit success by up to 17.1%, reduces latency by 41.6%, and preserves accuracy on unrelated queries, demonstrating scalable and adaptive knowledge editing.
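The two components named in this abstract, a REINFORCE-trained ranker and threshold-based pruning, can be illustrated with a minimal sketch. It is not the DR-IKE implementation: a linear scorer over hypothetical demonstration features replaces the BERT retriever, the reward is a made-up scalar, and σ is passed in as a fixed value because the abstract does not say how it is learned.

```python
import numpy as np

def policy_probs(theta, feats):
    """Softmax policy over candidate demonstrations (linear scorer assumed)."""
    logits = feats @ theta
    logits = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def reinforce_step(theta, feats, chosen, reward, baseline=0.0, lr=0.1):
    """One REINFORCE update after sampling demonstration `chosen`.

    For a softmax policy with linear logits, the gradient of log pi(chosen)
    with respect to theta is feats[chosen] minus the expected feature vector.
    """
    p = policy_probs(theta, feats)
    grad_log_p = feats[chosen] - p @ feats
    return theta + lr * (reward - baseline) * grad_log_p

def prune_demonstrations(theta, feats, sigma):
    """Keep demonstrations whose sigmoid score is at least sigma.

    Using the sigmoid of the raw score as the pruning criterion is an
    assumption. Returns kept indices, highest score first.
    """
    s = 1.0 / (1.0 + np.exp(-(feats @ theta)))
    keep = np.flatnonzero(s >= sigma)
    return keep[np.argsort(-s[keep])]

# Three candidate demonstrations with hypothetical 2-d features.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.zeros(2)

# Demonstration 0 led to a successful edit (reward 1), so its probability rises.
theta = reinforce_step(theta, feats, chosen=0, reward=1.0)
print(policy_probs(theta, feats))

# A lower threshold keeps a longer prompt; a higher one shortens it.
scorer = np.array([2.0, -2.0])
print(prune_demonstrations(scorer, feats, sigma=0.4))
print(prune_demonstrations(scorer, feats, sigma=0.8))
```

The last two calls show the intended effect of the threshold: the same scores yield two demonstrations at σ = 0.4 and only one at σ = 0.8.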