Pengfei Zhang
2026
DemMA: Dementia Multi-Turn Dialogue Agent with Expert-Guided Reasoning and Action Simulation
Yutong Song | Jiang Wu | Kazi Shaharair Sharif | Pengfei Zhang | Wenjun Huang | Honghui Xu | Nikil Dutt | Amir M. Rahmani
Findings of the Association for Computational Linguistics: ACL 2026
Simulating dementia patients with large language models (LLMs) is challenging due to the need to jointly model cognitive impairment, emotional dynamics, and nonverbal behaviors over long conversations. We present DemMA, an expert-guided dementia dialogue agent for high-fidelity multi-turn patient simulation. DemMA constructs clinically grounded dementia personas by integrating pathology information, personality traits, and subtype-specific memory-status personas informed by clinical experts. To move beyond text-only simulation, DemMA explicitly models nonverbal behaviors, including motion, facial expressions, and vocal cues. We further introduce a Chain-of-Thought distillation framework that trains a single LLM to jointly generate reasoning traces, patient utterances, and aligned behavioral actions within one forward pass, enabling efficient deployment without multi-agent inference.
LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey
Henry Peng Zou | Wei-Chieh Huang | Yaozu Wu | Jizhou Guo | Yankai Chen | Chunyu Miao | Hoang H Nguyen | Yue Zhou | Weizhi Zhang | Liancheng Fang | Hanrong Zhang | Fangxin Wang | Pengfei Zhang | Langzhou He | Yangning Li | Dongyuan Li | Renhe Jiang | Philip S. Yu
Findings of the Association for Computational Linguistics: ACL 2026
Recent advances in large language models (LLMs) have sparked growing interest in building fully autonomous agents. However, fully autonomous LLM-based agents still face significant challenges, including limited reliability due to hallucinations, difficulty in handling complex tasks, and substantial safety and ethical risks, all of which limit their feasibility and trustworthiness in real-world applications. To overcome these limitations, LLM-based human-agent systems (LLM-HAS) incorporate human-provided information, feedback, or control into the agent system to enhance system performance, reliability, and safety. These human-agent collaboration systems enable humans and LLM-based agents to collaborate effectively by leveraging their complementary strengths. This paper provides the first comprehensive and structured survey of LLM-HAS. It clarifies fundamental concepts, systematically presents core components shaping these systems, including environment and profiling, human feedback, interaction types, orchestration, and communication, explores emerging applications, and discusses unique challenges and opportunities arising from human-AI collaboration. By consolidating current knowledge and offering a structured overview, we aim to foster further research and innovation in this rapidly evolving interdisciplinary field. Paper lists and resources are available at https://github.com/HenryPengZou/Awesome-Human-Agent-Collaboration-Interaction-Systems.
2025
Towards Controllable Speech Synthesis in the Era of Large Language Models: A Systematic Survey
Tianxin Xie | Yan Rong | Pengfei Zhang | Wenwu Wang | Li Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Text-to-speech (TTS) has advanced from generating natural-sounding speech to enabling fine-grained control over attributes like emotion, timbre, and style. Driven by rising industrial demand and breakthroughs in deep learning, e.g., diffusion and large language models (LLMs), controllable TTS has become a rapidly growing research area. This survey provides the first comprehensive review of controllable TTS methods, from traditional control techniques to emerging approaches using natural language prompts. We categorize model architectures, control strategies, and feature representations, while also summarizing challenges, datasets, and evaluations in controllable TTS. This survey aims to guide researchers and practitioners by offering a clear taxonomy and highlighting future directions in this fast-evolving field. One can visit https://github.com/imxtx/awesome-controllabe-speech-synthesis for a comprehensive paper list and updates.
Co-authors
- Yankai Chen 1
- Nikil Dutt 1
- Liancheng Fang 1
- Jizhou Guo 1
- Langzhou He 1
- Wenjun Huang 1
- Wei-Chieh Huang 1
- Renhe Jiang 1
- Yangning Li 1
- Dongyuan Li 1
- Li Liu 1
- Chunyu Miao 1
- Hoang H Nguyen 1
- Amir M. Rahmani 1
- Yan Rong 1
- Kazi Shaharair Sharif 1
- Yutong Song 1
- Fangxin Wang 1
- Wenwu Wang 1
- Jiang Wu 1
- Yaozu Wu 1
- Tianxin Xie 1
- Honghui Xu 1
- Philip S. Yu 1
- Weizhi Zhang 1
- Hanrong Zhang 1
- Yue Zhou 1
- Henry Peng Zou 1