Xun Yang
2025
MultiAgentESC: A LLM-based Multi-Agent Collaboration Framework for Emotional Support Conversation
Yangyang Xu | Jinpeng Hu | Zhuoer Zhao | Zhangling Duan | Xiao Sun | Xun Yang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The development of Emotional Support Conversation (ESC) systems is critical for delivering mental health support tailored to the needs of help-seekers. Recent advances in large language models (LLMs) have contributed to progress in this domain, but most existing studies focus on generating responses directly and overlook the integration of domain-specific reasoning and expert interaction. Therefore, in this paper, we propose a training-free Multi-Agent collaboration framework for ESC (MultiAgentESC). The framework is designed to emulate the human-like process of providing emotional support through three stages: dialogue analysis, strategy deliberation, and response generation. At each stage, a multi-agent system is employed to iteratively enhance information understanding and reasoning, simulating real-world decision-making processes by incorporating diverse interactions among these expert agents. Additionally, we introduce a novel response-centered approach to handle the one-to-many problem in strategy selection, where multiple valid strategies are initially employed to generate diverse responses, followed by the selection of the optimal response through multi-agent collaboration. Experiments on the ESConv dataset reveal that our proposed framework excels at providing emotional support as well as diversifying support strategy selection.
2024
Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers
Haowen Pan | Yixin Cao | Xiaozhi Wang | Xun Yang | Meng Wang
Findings of the Association for Computational Linguistics: ACL 2024
Understanding the internal mechanisms by which multi-modal large language models (LLMs) interpret different modalities and integrate cross-modal representations is becoming increasingly critical for continuous improvements in both academia and industry. In this paper, we propose a novel method to identify the key neurons that explain how multi-modal LLMs bridge visual and textual concepts for captioning. Our method improves on conventional approaches in efficiency and range of application by removing the need for costly gradient computation. Based on the identified neurons, we further design a multi-modal knowledge editing method that helps mitigate sensitive words or hallucination. We provide a theoretical assumption as the rationale for our design. For empirical evaluation, we have conducted extensive quantitative and qualitative experiments. The results not only validate the effectiveness of our methods, but also offer insightful findings that highlight three key properties of multi-modal neurons: sensitivity, specificity and causal-effect, which may shed light on future research.