SocialBench: Sociality Evaluation of Role-Playing Conversational Agents
Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Gao Xing, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, Fei Huang
Abstract
Large language models (LLMs) have advanced the development of various AI conversational agents, including role-playing agents that mimic diverse characters and human behaviors. While prior research has predominantly focused on enhancing the conversational capability, role-specific knowledge, and style of these agents, there has been a noticeable gap in assessing their social intelligence. In this paper, we introduce SocialBench, the first benchmark designed to systematically evaluate the sociality of role-playing agents at both the individual and group levels of social interaction. SocialBench is constructed from various sources and covers 500 characters, over 6,000 question prompts, and 30,800 multi-turn role-playing utterances. We conduct comprehensive evaluations on this benchmark using mainstream LLMs. We find that agents excelling at the individual level do not necessarily demonstrate proficiency at the group level. Experimental results on SocialBench confirm its significance as a testbed for assessing the social interaction of role-playing agents. The benchmark is publicly accessible at https://github.com/X-PLUG/RoleInteract.
- Anthology ID: 2024.findings-acl.125
- Volume: Findings of the Association for Computational Linguistics: ACL 2024
- Month: August
- Year: 2024
- Address: Bangkok, Thailand
- Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 2108–2126
- URL: https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-acl.125/
- DOI: 10.18653/v1/2024.findings-acl.125
- Cite (ACL): Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Gao Xing, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, and Fei Huang. 2024. SocialBench: Sociality Evaluation of Role-Playing Conversational Agents. In Findings of the Association for Computational Linguistics: ACL 2024, pages 2108–2126, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal): SocialBench: Sociality Evaluation of Role-Playing Conversational Agents (Chen et al., Findings 2024)
- PDF: https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-acl.125.pdf