Yingyu Shan
2026
Mem2Evolve: Towards Self-Evolving Agents via Co-Evolutionary Capability Expansion and Experience Distillation
Zihao Cheng | Zeming Liu | Yingyu Shan | Xinyi Wang | Xiangrong Zhu | Yunpu Ma | Hongru Wang | Yuhang Guo | Wei Lin | Yunhong Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While large language model–powered agents can self-evolve by accumulating experience or by dynamically creating new assets (i.e., tools or expert agents), existing frameworks typically treat these two evolutionary processes in isolation. This separation overlooks their intrinsic interdependence: the former is inherently bounded by a manually predefined static toolset, while the latter generates new assets from scratch without experiential guidance, leading to limited capability growth and unstable evolution. To address this limitation, we introduce a novel paradigm of co-evolutionary Capability Expansion and Experience Distillation. Guided by this paradigm, we propose **Mem2Evolve**, which integrates two core components: **Experience Memory** and **Asset Memory**. Specifically, Mem2Evolve leverages accumulated experience to guide the dynamic creation of assets, thereby expanding the agent’s capability space while simultaneously acquiring new experience to achieve co-evolution. Extensive experiments across 6 task categories and 8 benchmarks demonstrate that Mem2Evolve achieves improvements of 18.53% over standard LLMs, 11.80% over agents evolving solely through experience, and 6.46% over those evolving solely through asset creation, establishing it as a substantially more effective and stable self-evolving agent framework.
2024
FAME: Towards Factual Multi-Task Model Editing
Li Zeng | Yingyu Shan | Zeming Liu | Jiashu Yao | Yuhang Guo
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Large language models (LLMs) embed extensive knowledge and utilize it to perform exceptionally well across various tasks. Nevertheless, outdated knowledge or factual errors within LLMs can lead to misleading or incorrect responses, causing significant issues in practical applications. To rectify these flaws without costly model retraining, various model editing approaches have been proposed to correct inaccurate information within LLMs in a cost-efficient way. To evaluate these model editing methods, previous work introduced a series of datasets. However, most of these datasets contain only fabricated data in a single format, which diverges from real-world model editing scenarios and raises doubts about their usability in practice. To facilitate the application of model editing in real-world scenarios, we propose the challenge of practicality. To address this challenge and effectively enhance the capabilities of LLMs, we present FAME, an authentic, comprehensive, and multi-task dataset designed to enhance the practicality of model editing. We then propose SKEME, a model editing method that uses a novel caching mechanism to ensure synchronization with the real world. The experiments demonstrate that our method performs excellently across various tasks and scenarios, confirming its practicality.
2023
BIT-ACT: An Ancient Chinese Translation System Using Data Augmentation
Li Zeng | Yanzhi Tian | Yingyu Shan | Yuhang Guo
Proceedings of ALT2023: Ancient Language Translation Workshop
This paper describes a translation model from ancient Chinese to modern Chinese and English for the EvaHan 2023 competition, a subtask of the Ancient Language Translation 2023 challenge. During the training of our model, we applied various data augmentation techniques and used SiKu-RoBERTa as part of our model architecture. The results indicate that back translation improves the model’s performance, but double back translation introduces noise and harms the model’s performance. Fine-tuning on the original dataset can help mitigate this issue.