Yangfan Wang
2026
HiEdit: Lifelong Model Editing with Hierarchical Reinforcement Learning
Yangfan Wang | Tianyang Sun | Chen Tang | Jie Liu | Wei Cai | Jingchi Jiang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Yangfan Wang | Tianyang Sun | Chen Tang | Jie Liu | Wei Cai | Jingchi Jiang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Lifelong model editing (LME) aims to sequentially rectify outdated or inaccurate knowledge in deployed LLMs while minimizing side effects on unrelated inputs. However, existing approaches typically apply parameter perturbations to a static and dense set of LLM layers for all editing instances. This practice is counter-intuitive, as we hypothesize that different pieces of knowledge are stored in distinct layers of the model. Neglecting this layer-wise specificity can impede adaptability in integrating new knowledge and result in catastrophic forgetting for both general and previously edited knowledge. To address this, we propose HiEdit, a hierarchical reinforcement learning framework that adaptively identifies the most knowledge-relevant layers for each editing instance. By enabling dynamic, instance-aware layer selection and incorporating an intrinsic reward for sparsity, HiEdit achieves precise, localized updates. Experiments on various LLMs show that HiEdit boosts the performance of the competitive RLEdit by an average of 8.48% with perturbing only half of the layers per edit. Our code is available at: https://github.com/yangfanww/hiedit.
2025
KCS: Diversify Multi-hop Question Generation with Knowledge Composition Sampling
Yangfan Wang | Jie Liu | Chen Tang | Lian Yan | Jingchi Jiang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Yangfan Wang | Jie Liu | Chen Tang | Lian Yan | Jingchi Jiang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Multi-hop question answering faces substantial challenges due to data sparsity, which increases the likelihood of language models learning spurious patterns. To address this issue, prior research has focused on diversifying question generation through content planning and varied expression. However, these approaches often emphasize generating simple questions and neglect the integration of essential knowledge, such as relevant sentences within documents. This paper introduces the **Knowledge Composition Sampling (KCS)**, an innovative framework designed to expand the diversity of generated multi-hop questions by sampling varied knowledge compositions within a given context. KCS models the knowledge composition selection as a sentence-level conditional prediction task and utilizes a probabilistic contrastive loss to predict the next most relevant piece of knowledge. During inference, we employ a stochastic decoding strategy to effectively balance accuracy and diversity. Compared to competitive baselines, our KCS improves the overall accuracy of knowledge composition selection by 3.9%, and its application for data augmentation yields improvements on HotpotQA and 2WikiMultihopQA datasets. Our code is available at: https://github.com/yangfanww/kcs.