Weiwen Su

Also published as: Weiwen SU

2026

Is He Extroverted? Identifying Missing Relevant Personas for Faithful User Simulation
Weiwen Su | Yuhan Zhou | Zihan Wang | Naoki Yoshinaga | Masashi Toyoda
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

Existing user simulation approaches focus on generating user-like responses in dialogue. They often assume that the provided persona is sufficient for producing such responses, without verifying whether critical personas are supplied. This raises concerns about the validity of simulation results. To address this issue, we study the task of identifying persona dimensions (e.g., “whether the user is price-sensitive”) that are relevant but missing in simulating a user’s reply for a given dialogue context. We introduce PICQ-drama (constructed from TVShowGuess), a benchmark of context-aware choice questions, annotated with missing persona dimensions whose absence leads to ambiguous user choices. We further design diverse evaluation criteria for missing persona identification. Benchmarking leading LLMs on our PICQ-drama dataset demonstrates the feasibility of this task. Evaluation across diverse criteria, along with further analyses, reveals cognitive differences between LLMs and humans and highlights the distinct roles of different persona categories in shaping responses.

Query-Focused Individual Simulation with Progressive Persona Completion
Weiwen SU | Naoki Yoshinaga | Masashi Toyoda
Findings of the Association for Computational Linguistics: ACL 2026

Large language models (LLMs) enable simulating individual responses from persona information, supporting applications such as opinion elicitation and virtual character creation. However, existing approaches typically assume rich persona profiles, which are often unavailable in practice. In this work, motivated by recent findings that LLMs can identify query-relevant persona dimensions (e.g., whether a user is price-sensitive), we study query-focused individual simulation under cold-start settings, where relevant persona information is identified and requested on demand for each query. To solve this task while minimizing the number of persona requests, we explore a progressive method that iteratively predicts the most critical relevant persona dimension and uses self-reported confidence as a stopping signal to determine when sufficient information has been collected. Experiments on two dialogue datasets show that this query-driven paradigm achieves simulation performance comparable to approaches that rely on rich persona information extracted from dialogue history, using only a few persona dimensions (up to five per query), and this number is further reduced by our progressive method while maintaining or improving simulation quality.
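
The progressive loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `progressive_simulation`, the three callables (`propose_dimension`, `request_persona`, `simulate`), and the `confidence_threshold` value are all hypothetical names and settings introduced here for illustration. Only the cap of five persona requests per query and the use of self-reported confidence as a stopping signal come from the abstract.

```python
from typing import Callable, Dict, Optional, Tuple


def progressive_simulation(
    query: str,
    propose_dimension: Callable[[str, Dict[str, str]], Optional[str]],
    request_persona: Callable[[str], str],
    simulate: Callable[[str, Dict[str, str]], Tuple[str, float]],
    confidence_threshold: float = 0.8,  # assumed value, not from the paper
    max_requests: int = 5,  # the abstract reports up to five dimensions per query
) -> Tuple[str, Dict[str, str]]:
    """Request persona dimensions one at a time until the simulator is confident.

    All three callables are hypothetical stand-ins (for example, LLM calls):
      propose_dimension: names the most critical missing dimension, or None.
      request_persona:   returns the individual's value for that dimension.
      simulate:          returns (response, self-reported confidence in [0, 1]).
    """
    known: Dict[str, str] = {}  # cold start: no persona information yet
    response, confidence = simulate(query, known)

    while confidence < confidence_threshold and len(known) < max_requests:
        dimension = propose_dimension(query, known)
        if dimension is None or dimension in known:
            break  # nothing new to ask for
        known[dimension] = request_persona(dimension)
        response, confidence = simulate(query, known)

    return response, known
```

In this sketch the stopping rule is a fixed threshold on the confidence returned by `simulate`; the paper may calibrate or elicit confidence differently.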

Venues

EACL (1)
Findings (1)