Yufei Zhang
2026
Beyond Dialogue Time: Temporal Semantic Memory for Personalized LLM Agents
Miao Su | Yucan Guo | Zhongni Hou | Long Bai | Zixuan Li | Yufei Zhang | Guojun Yin | Wei Lin | Xiaolong Jin | Jiafeng Guo | Xueqi Cheng
Findings of the Association for Computational Linguistics: ACL 2026
Memory enables Large Language Model (LLM) agents to perceive, store, and use information from past dialogues, which is essential for personalization. However, existing methods fail to properly model the temporal dimension of memory in two aspects: 1) Temporal inaccuracy: memories are organized by dialogue time rather than their actual occurrence time; 2) Temporal fragmentation: existing methods focus on point-wise memory, losing durative information that captures persistent states and evolving patterns. To address these limitations, we propose Temporal Semantic Memory (TSM), a memory framework that models semantic time for point-wise memory and supports the construction and utilization of durative memory. During memory construction, it first builds a semantic timeline rather than a dialogue one. Then, it consolidates temporally continuous and semantically related information into a durative memory. During memory utilization, it incorporates the query’s temporal intent on the semantic timeline, enabling the retrieval of temporally appropriate durative memories and providing time-valid, duration-consistent context to support response generation. Experiments on LongMemEval and LoCoMo show that TSM consistently outperforms existing methods and achieves up to 12.2% absolute improvement in accuracy, demonstrating the effectiveness of the proposed method.
2025
Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems
Song Jin | Juntian Zhang | Yuhan Liu | Xun Zhang | Yufei Zhang | Guojun Yin | Fei Jiang | Wei Lin | Rui Yan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Evaluating and iterating upon recommender systems is crucial, yet traditional A/B testing is resource-intensive, and offline methods struggle with dynamic user-platform interactions. While agent-based simulation is promising, existing platforms often lack a mechanism for user actions to dynamically reshape the environment. To bridge this gap, we introduce RecInter, a novel agent-based simulation platform for recommender systems featuring a robust interaction mechanism. In the RecInter platform, simulated user actions (e.g., likes, reviews, purchases) dynamically update item attributes in real time, and newly introduced Merchant Agents can reply, fostering a more realistic and evolving ecosystem. High-fidelity simulation is ensured through a Multidimensional User Profiling module, an Advanced Agent Architecture, and an LLM fine-tuned on Chain-of-Thought (CoT) enriched interaction data. Our platform achieves significantly improved simulation credibility and successfully replicates emergent phenomena like Brand Loyalty and the Matthew Effect. Experiments demonstrate that this interaction mechanism is pivotal for simulating realistic system evolution, establishing our platform as a credible testbed for recommender systems research. All code is released at https://github.com/jinsong8/RecInter.
Com2: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models
Kai Xiong | Xiao Ding | Yixin Cao | Yuxiong Yan | Li Du | Yufei Zhang | Jinglong Gao | Jiaqian Liu | Bing Qin | Ting Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) have mastered abundant simple and explicit commonsense knowledge through pre-training, enabling them to achieve human-like performance in simple commonsense reasoning. Nevertheless, LLMs struggle to reason with complex and implicit commonsense knowledge that is derived from simple ones (such as understanding the long-term effects of certain events), an aspect humans tend to focus on more. Existing works focus on complex tasks like math and code, while complex commonsense reasoning remains underexplored due to its uncertainty and lack of structure. To fill this gap and align with real-world concerns, we propose a benchmark Com2 focusing on complex commonsense reasoning. We first incorporate causal event graphs to serve as structured complex commonsense. Then we adopt causal theory (e.g., intervention) to modify the causal event graphs and obtain different scenarios that meet human concerns. Finally, an LLM is employed to synthesize examples with slow thinking, which is guided by the logical relationships in the modified causal graphs. Furthermore, we use detective stories to construct a more challenging subset. Experiments show that LLMs struggle in reasoning depth and breadth, while post-training and slow thinking can alleviate this. The code and data are available at https://github.com/Waste-Wood/Com2.
UIOrchestra: Generating High-Fidelity Code from UI Designs with a Multi-agent System
Chuhuai Yue | Jiajun Chai | Yufei Zhang | Zixiang Ding | Xihao Liang | Peixin Wang | Shihai Chen | Wang Yixuan | Wangyanping | Guojun Yin | Wei Lin
Findings of the Association for Computational Linguistics: EMNLP 2025
Recent advances in large language models (LLMs) have significantly improved automated code generation, enabling tools such as GitHub Copilot and CodeWhisperer to assist developers in a wide range of programming tasks. However, the translation of complex mobile UI designs into high-fidelity front-end code remains a challenging and underexplored area, especially as modern app interfaces become increasingly intricate. In this work, we propose UIOrchestra, a collaborative multi-agent system designed for the AppUI2Code task, which aims to reconstruct static single-page applications from design mockups. UIOrchestra integrates three specialized agents (a layout description agent, a code generation agent, and a difference analysis agent) that work collaboratively to address the limitations of single-model approaches. To facilitate robust evaluation, we introduce APPUI, the first benchmark dataset for AppUI2Code, constructed through a human-in-the-loop process to ensure data quality and coverage. Experimental results demonstrate that UIOrchestra outperforms existing methods in reconstructing complex app pages and highlight the necessity of multi-agent collaboration for this task. We hope our work will inspire further research on leveraging LLMs for front-end automation. The code and data will be released upon paper acceptance.
Co-authors
- Wei Lin 3
- Guojun Yin 3
- Long Bai 1
- Yixin Cao 1
- Jiajun Chai 1
- Shihai Chen 1
- Xueqi Cheng (程学旗) 1
- Xiao Ding 1
- Zixiang Ding 1
- Li Du 1
- Jinglong Gao 1
- Yucan Guo 1
- Jiafeng Guo (嘉丰 郭) 1
- Zhongni Hou 1
- Fei Jiang 1
- Song Jin 1
- Xiaolong Jin 1
- Zixuan Li 1
- Xihao Liang 1
- Yuhan Liu 1
- Jiaqian Liu 1
- Ting Liu 1
- Bing Qin (秦兵) 1
- Miao Su 1
- Peixin Wang 1
- Wangyanping 1
- Kai Xiong 1
- Rui Yan 1
- Yuxiong Yan 1
- Wang Yixuan 1
- Chuhuai Yue 1
- Juntian Zhang 1
- Xun Zhang 1