Yifan Zhou
2026
Synapse: Empowering LLM Agents with Episodic-Semantic Memory via Spreading Activation
Hanqi Jiang | Junhao Chen | Yi Pan | Ling Chen | Weihang You | Yifan Zhou | Ruidong Zhang | Yohannes Abate | Tianming Liu
Findings of the Association for Computational Linguistics: ACL 2026
While Large Language Models (LLMs) excel at generalized reasoning, standard retrieval-augmented approaches fail to address the disconnected nature of long-term agentic memory. To bridge this gap, we introduce Synapse (Synergistic Associative Processing Semantic Encoding), a unified memory architecture that transcends static vector similarity. Drawing from cognitive science, Synapse models memory as a dynamic graph where relevance emerges from spreading activation rather than pre-computed links. By integrating lateral inhibition and temporal decay, the system dynamically highlights relevant sub-graphs while filtering interference. We implement a Triple Hybrid Retrieval strategy that fuses geometric embeddings with activation-based graph traversal. Extensive evaluations on the LoCoMo benchmark show that Synapse significantly outperforms state-of-the-art methods in complex temporal and multi-hop reasoning tasks, offering a robust solution to the "Contextual Tunneling" problem.
Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning
Zelin Tan | Hejia Geng | Xiaohang Yu | Mulei Zhang | Guancheng Wan | Yifan Zhou | Qiang He | Xiangyuan Xue | Heng Zhou | Yutao Fan | Zhong-Zhi Li | Zaibin Zhang | Guibin Zhang | Chen Zhang | Zhenfei Yin | Philip Torr | Lei Bai
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While scaling laws for large language models (LLMs) during pre-training have been extensively studied, their behavior under reinforcement learning (RL) post-training remains largely unexplored. This paper investigates the scaling behavior of LLM RL post-training, focusing on mathematical reasoning. Through experiments across the Qwen2.5 series (0.5B to 72B), we characterize how model scale, data, and compute interact. Our analysis yields four key findings: 1. Larger models consistently demonstrate superior compute and data efficiency. 2. The relationship between model performance and training resources follows a **predictive power-law** across both base and instruction-tuned models. 3. RL learning efficiency exhibits a latent **saturation trend** with increasing model scale. 4. In data-constrained regimes, performance is primarily driven by the **total volume of training data** rather than sample uniqueness. These results offer practical guidelines for scaling reasoning capabilities through reinforcement learning post-training.
2025
The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters
Chulun Zhou | Qiujing Wang | Mo Yu | Xiaoqian Yue | Rui Lu | Jiangnan Li | Yifan Zhou | Shunchi Zhang | Jie Zhou | Wai Lam
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Theory-of-Mind (ToM) is a fundamental psychological capability that allows humans to understand and interpret the mental states of others. Humans infer others’ thoughts by integrating causal cues and indirect clues from broad contextual information, often derived from past interactions. In other words, human ToM relies heavily on an understanding of the backgrounds and life stories of others. Unfortunately, this aspect is largely overlooked in existing benchmarks for evaluating machines’ ToM capabilities, because they use short narratives that lack global context, especially the personal backgrounds of characters. In this paper, we verify the importance of comprehensive contextual understanding of personal backgrounds in ToM and assess the performance of LLMs in such complex scenarios. To achieve this, we introduce the CharToM-QA benchmark, comprising 1,035 ToM questions based on characters from classic novels. Our human study reveals a significant disparity in performance: the same group of educated participants performs dramatically better when they have read the novels than when they have not. In parallel, our experiments on state-of-the-art LLMs, including the very recent o1 and DeepSeek-R1 models, show that LLMs still perform notably worse than humans, even though they have seen these stories during pre-training. This highlights the limitations of current LLMs in capturing the nuanced contextual information required for ToM reasoning.
Co-authors
- Yohannes Abate 1
- Lei Bai 1
- Junhao Chen 1
- Ling Chen 1
- Yutao Fan 1
- Hejia Geng 1
- Qiang He 1
- Hanqi Jiang 1
- Wai Lam 1
- Zhong-Zhi Li 1
- Jiangnan Li 1
- Tianming Liu 1
- Rui Lu 1
- Yi Pan 1
- Zelin Tan 1
- Philip Torr 1
- Guancheng Wan 1
- Qiujing Wang 1
- Xiangyuan Xue 1
- Zhenfei Yin 1
- Weihang You 1
- Xiaohang Yu 1
- Mo Yu 1
- Xiaoqian Yue 1
- Ruidong Zhang 1
- Mulei Zhang 1
- Zaibin Zhang 1
- Guibin Zhang 1
- Chen Zhang 1
- Shunchi Zhang 1
- Heng Zhou 1
- Chulun Zhou 1
- Jie Zhou 1