Chunyang Gao

2026

Memory serves as a pivotal component in interactive response generation, supplying essential background information and referential knowledge for dialogues. Conventional interactive algorithms have predominantly treated memory as a merely contextual element, largely neglecting the nuanced cognitive processes involved in individualized memory encoding and retrieval. This conceptual gap has led to the prevailing schema where memory-enhanced dialogue datasets incorporate monolithic, undifferentiated memory content, failing to capture the personalized nature of personal memory processing. Grounded in the self-reference effect from cognitive psychology, we introduce a Multi-Turn Dialogue Dataset with Personalized Contextual Memory (), establishing a comprehensive benchmark to facilitate advanced research on personalized memory processing algorithms.

Beyond Static Profiles: Capturing the Fluidity of User Preferences in Diverse Scenarios
Chunyang Gao | Yi Huang | Jingyu Yao | Xiaoting Wu | Junlan Feng
Findings of the Association for Computational Linguistics: ACL 2026

Despite the remarkable evolution of Large Language Models (LLMs) from simple assistants to versatile agents, effective personalization remains a significant challenge. Existing approaches often treat user preferences as static or merely time-varying traits, overlooking the dynamic nature of human behavior: preferences can shift, and even conflict, depending on context. To address this limitation, we propose a fine-grained taxonomy to differentiate between stable preferences, which are context-agnostic, and situational preferences, which are context-dependent. Building on this taxonomy, we introduce S2Pref, a new dataset of 10k meticulously curated entries. Each entry is grounded in a multi-turn dialogue that implicitly manifests either a stable or a situational preference, as defined by our hierarchical taxonomy. We further design three complementary evaluation tasks to benchmark LLMs on their ability to prioritize contextual signals, proactively resolve ambiguity, and efficiently infer user preferences. Our dataset and diagnostic tasks provide a practical testbed for advancing dynamic, context-aware personalization in conversational agents.

Existing user simulators based on prompting to role-play or SFT are generally confined to imitating users’ textual utterances, without adequately considering the multi-faceted cognitive processes that underlie human decision-making during interactions. To facilitate better alignment with real human thinking patterns, we construct the LMSYS-UserThinking dataset, in which we augment 51k human–LLM conversations by reconstructing the user’s inner reasoning both during and at the end of each dialogue. Furthermore, to enhance controllability and situational coherence, we introduce scenario settings that describe the global context and user goals throughout multi-turn conversations. Using this dataset, we train user simulators called ThinkingUS on different base models. We evaluate our approach from both offline and online user simulation perspectives, ultimately demonstrating its effectiveness.

Co-authors

Mengfei Guo 1

Zhe Yang 1

Venues

Findings 2
ACL 1
