Yankai Chen
2026
GAM: Hierarchical Graph-based Agentic Memory for LLM Agents
Zhaofen Wu | Hanrong Zhang | Fulin Lin | Wujiang Xu | Xinran Xu | Yankai Chen | Henry Peng Zou | Shaowen Chen | Weizhi Zhang | Xue Liu | Philip S. Yu | Hongwei Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
To sustain coherent long-term interactions, Large Language Model (LLM) agents must navigate the tension between acquiring new information and retaining prior knowledge. Current unified stream-based memory systems facilitate context updates but remain vulnerable to interference from transient noise. Conversely, discrete structured memory architectures provide robust knowledge retention but often struggle to adapt to fluid narrative evolution. To address this, we propose GAM, a hierarchical Graph-based Agentic Memory framework that explicitly decouples memory encoding from consolidation to effectively resolve the conflict between rapid context perception and stable knowledge retention. By isolating ongoing dialogue in an event progression graph and integrating it into a topic associative network only upon semantic shifts, our approach minimizes interference while preserving long-term consistency. Additionally, we introduce a Graph-guided, Multi-factor Retrieval strategy to enhance context precision. Experiments on LoCoMo and LongDialQA benchmarks indicate that our method consistently outperforms state-of-the-art baselines in both reasoning accuracy and computational efficiency.
Deep Research with Open-Domain Evaluation and Multi-Stage Guardrails for Safety
Wei-Chieh Huang | Henry Peng Zou | Yaozu Wu | Dongyuan Li | Yankai Chen | Weizhi Zhang | Yangning Li | Angelo Zangari | Jizhou Guo | Chunyu Miao | Liancheng Fang | Langzhou He | Yinghui Li | Renhe Jiang | Philip S. Yu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Deep research frameworks have shown promising capabilities in synthesizing comprehensive reports from web sources. While deep research possesses significant potential to address complex issues through planning and research cycles, existing frameworks lack sufficient evaluation procedures and stage-specific protections. They typically treat evaluation as exact match accuracy of question-answering, but overlook crucial aspects of report quality such as credibility, coherence, breadth, depth, and safety. This oversight may result in hazardous or malicious sources being integrated into the final report. To address this, we introduce DeepResearchGuard, a framework featuring four-stage safeguards with open-domain evaluation, and DRSafeBench, a novel stage-wise safety benchmark. Evaluating across GPT-4o, o4-mini, Gemini-2.5-flash, DeepSeek-v3, and GPT-5, DeepResearchGuard improves defense success rates by an absolute 16.53% while reducing over-refusal rates to approximately 6%. Through extensive experiments, we show that DeepResearchGuard enables comprehensive open-domain evaluation and stage-aware defenses that effectively block harmful content propagation, while systematically improving report quality without excessive over-refusal rates.
LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey
Henry Peng Zou | Wei-Chieh Huang | Yaozu Wu | Jizhou Guo | Yankai Chen | Chunyu Miao | Hoang H Nguyen | Yue Zhou | Weizhi Zhang | Liancheng Fang | Hanrong Zhang | Fangxin Wang | Pengfei Zhang | Langzhou He | Yangning Li | Dongyuan Li | Renhe Jiang | Philip S. Yu
Findings of the Association for Computational Linguistics: ACL 2026
Recent advances in large language models (LLMs) have sparked growing interest in building fully autonomous agents. However, fully autonomous LLM-based agents still face significant challenges, including limited reliability due to hallucinations, difficulty in handling complex tasks, and substantial safety and ethical risks, all of which limit their feasibility and trustworthiness in real-world applications. To overcome these limitations, LLM-based human-agent systems (LLM-HAS) incorporate human-provided information, feedback, or control into the agent system to enhance system performance, reliability, and safety. These human-agent collaboration systems enable humans and LLM-based agents to collaborate effectively by leveraging their complementary strengths. This paper provides the first comprehensive and structured survey of LLM-HAS. It clarifies fundamental concepts, systematically presents core components shaping these systems, including environment and profiling, human feedback, interaction types, orchestration, and communication, explores emerging applications, and discusses unique challenges and opportunities arising from human-AI collaboration. By consolidating current knowledge and offering a structured overview, we aim to foster further research and innovation in this rapidly evolving interdisciplinary field. Paper lists and resources are available at https://github.com/HenryPengZou/Awesome-Human-Agent-Collaboration-Interaction-Systems.
Many-Shot Scaling of In-Context Learning with Self-Generated Demonstrations
Zhengyao Gu | Henry Peng Zou | Yankai Chen | Aiwei Liu | Weizhi Zhang | Philip S. Yu
Findings of the Association for Computational Linguistics: ACL 2026
The high cost of obtaining high-quality annotated data for in-context learning (ICL) has motivated the development of methods that use self-generated annotations in place of ground truth labels. While these approaches have shown promising results in few-shot settings, they generally do not scale to many-shot scenarios. In this work, we study ICL with self-generated examples using a framework analogous to traditional semi-supervised learning, consisting of annotation generation, demonstration selection, and in-context inference. Within this framework, we propose a simple baseline that outperforms ground truth ICL under zero-shot, few-shot, and many-shot settings. Notably, we observe consistent scaling behaviors with respect to the number of self-annotated demonstrations. To further extract performance from this many-shot capability, we introduce IterPSD, an iterative self-annotation approach that integrates iterative refinement and curriculum pseudo-labeling techniques from semi-supervised learning, yielding up to 6.8% additional gains on classification tasks. Motivated by our baseline and IterPSD results, we demonstrate that semi-supervised ICL offers a promising avenue for future ICL research.
2025
Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning
Yangning Li | Tingwei Lu | Yinghui Li | Yankai Chen | Wei-Chieh Huang | Wenhao Jiang | Hui Wang | Hai-Tao Zheng | Philip S. Yu
Findings of the Association for Computational Linguistics: EMNLP 2025
Efficient instruction tuning aims to enhance the ultimate performance of large language models (LLMs) trained on a given instruction dataset. Curriculum learning, a typical data organization strategy, has shown preliminary effectiveness in instruction tuning. However, current curriculum tuning methods suffer from curriculum rigidity, since they rely solely on static heuristic difficulty metrics. These methods fail to adapt to the evolving capabilities of models during training, resulting in a fixed and potentially sub-optimal learning trajectory. To address this issue, a **C**ompetence-**A**ware **M**ulti-**P**erspective c**U**rriculum in**S**truction tuning framework, termed **CAMPUS**, is proposed. CAMPUS offers several advantages: (1) Dynamic selection for sub-curriculum. (2) Competency-aware adjustment to the curriculum schedule. (3) Multiple difficulty-based scheduling. Extensive experiments demonstrate the superior performance of CAMPUS compared to other state-of-the-art baselines for efficient instruction tuning.
A Survey of RAG-Reasoning Systems in Large Language Models
Yangning Li | Weizhi Zhang | Yuyao Yang | Wei-Chieh Huang | Yaozu Wu | Junyu Luo | Yuanchen Bei | Henry Peng Zou | Xiao Luo | Yusheng Zhao | Chunkit Chan | Yankai Chen | Zhongfen Deng | Yinghui Li | Hai-Tao Zheng | Dongyuan Li | Renhe Jiang | Ming Zhang | Yangqiu Song | Philip S. Yu
Findings of the Association for Computational Linguistics: EMNLP 2025
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge, yet it falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts. This survey synthesizes both strands under a unified reasoning-search perspective. We first map how advanced reasoning optimizes each stage of RAG (Reasoning-Enhanced RAG). Then, we show how different types of retrieved knowledge supply missing premises and expand context for complex inference (RAG-Enhanced Reasoning). Finally, we spotlight emerging Synergized RAG-Reasoning frameworks, where (agentic) LLMs iteratively interleave search and thought to achieve state-of-the-art performance across knowledge-intensive benchmarks. We categorize methods, datasets, and open challenges, and outline research avenues toward deeper RAG-Reasoning systems that are more effective, multimodally-adaptive, trustworthy, and human-centric.
Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances, Resources, and Future Directions
Yaozu Wu | Dongyuan Li | Yankai Chen | Renhe Jiang | Henry Peng Zou | Wei-Chieh Huang | Yangning Li | Liancheng Fang | Zhen Wang | Philip S. Yu
Findings of the Association for Computational Linguistics: EMNLP 2025
Autonomous Driving Systems (ADSs) are revolutionizing transportation by reducing human intervention, improving operational efficiency, and enhancing safety. Large Language Models (LLMs), known for their exceptional planning and reasoning capabilities, have been integrated into ADSs to assist with driving decision-making. However, LLM-based single-agent ADSs face three major challenges: limited perception, insufficient collaboration, and high computational demands. To address these issues, recent advancements in LLM-based multi-agent ADSs have focused on improving inter-agent communication and cooperation. This paper provides a frontier survey of LLM-based multi-agent ADSs. We begin with a background introduction to related concepts, followed by a categorization of existing LLM-based approaches based on different agent interaction modes. We then discuss agent-human interactions in scenarios where LLM-based agents engage with humans. Finally, we summarize key applications, datasets, and challenges in this field to support future research (https://github.com/Yaozuwu/LLM-based_Multi-agent_ADS).
TestNUC: Enhancing Test-Time Computing Approaches and Scaling through Neighboring Unlabeled Data Consistency
Henry Peng Zou | Zhengyao Gu | Yue Zhou | Yankai Chen | Weizhi Zhang | Liancheng Fang | Yibo Wang | Yangning Li | Kay Liu | Philip S. Yu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Test-time computing approaches, which leverage additional computational resources during inference, have been proven effective in enhancing large language model performance. This work introduces a novel, linearly scaling approach, TestNUC, that improves test-time predictions by leveraging the local consistency of neighboring unlabeled data: it classifies an input instance by considering not only the model’s prediction on that instance but also on neighboring unlabeled instances. We evaluate TestNUC across eight diverse datasets, spanning intent classification, topic mining, domain discovery, and emotion detection, demonstrating its consistent superiority over baseline methods such as standard prompting and self-consistency. Furthermore, TestNUC can be seamlessly integrated with existing test-time computing approaches, substantially boosting their performance. Our analysis reveals that TestNUC scales effectively with increasing amounts of unlabeled data and performs robustly across different embedding models, making it practical for real-world applications. Our code is available at https://github.com/HenryPengZou/TestNUC.
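As a rough illustration of the neighbor-consistency idea described in the abstract above, the sketch below combines a model's prediction on one instance with its predictions on the nearest unlabeled instances. It is not the authors' implementation: the function names (`cosine`, `neighbor_vote`), the use of cosine similarity, the unweighted majority vote, and the tie-breaking rule are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighbor_vote(query_vec, query_label, pool, k=3):
    """Hypothetical helper: majority vote over the model's own prediction
    and its predictions on the k most similar unlabeled instances.

    pool is a list of (embedding, predicted_label) pairs for unlabeled data.
    """
    # Rank unlabeled instances by similarity to the query embedding.
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)

    # Count the query's own predicted label plus the neighbors' predicted labels.
    votes = Counter([query_label])
    votes.update(label for _, label in ranked[:k])

    # Assumption: ties are resolved in favor of the query's own prediction.
    top = max(votes.values())
    if votes[query_label] == top:
        return query_label
    return votes.most_common(1)[0][0]


pool = [([1.0, 0.0], "a"), ([0.9, 0.1], "a"), ([0.0, 1.0], "b")]
final_label = neighbor_vote([1.0, 0.05], "b", pool, k=2)
```

Here the query's own prediction is "b", but its two nearest neighbors were both predicted as "a", so the vote returns "a". The cost grows linearly with the number of unlabeled instances scanned, which is one simple reading of the "linearly scaling" description; the actual method may weight or aggregate neighbors differently.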
Co-authors
- Philip S. Yu 8
- Henry Peng Zou 7
- Yangning Li 6
- Weizhi Zhang 6
- Wei-Chieh Huang 5
- Liancheng Fang 4
- Renhe Jiang 4
- Dongyuan Li 4
- Yaozu Wu 4
- Yinghui Li 3
- Zhengyao Gu 2
- Jizhou Guo 2
- Langzhou He 2
- Chunyu Miao 2
- Hanrong Zhang 2
- Hai-Tao Zheng 2
- Yue Zhou 2
- Yuanchen Bei 1
- Chunkit Chan 1
- Shaowen Chen 1
- Zhongfen Deng 1
- Wenhao Jiang 1
- Fulin Lin 1
- Xue Liu 1
- Aiwei Liu 1
- Kay Liu 1
- Tingwei Lu 1
- Junyu Luo 1
- Xiao Luo 1
- Hoang H Nguyen 1
- Yangqiu Song 1
- Hongwei Wang 1
- Hui Wang 1
- Fangxin Wang 1
- Zhen Wang 1
- Yibo Wang 1
- Zhaofen Wu 1
- Wujiang Xu 1
- Xinran Xu 1
- Yuyao Yang 1
- Angelo Zangari 1
- Ming Zhang 1
- Pengfei Zhang 1
- Yusheng Zhao 1