Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark

Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens


Abstract
Large language models (LLMs) have been shown to perform well at a variety of syntactic, discourse, and reasoning tasks. While LLMs are increasingly deployed in many forms including conversational agents that interact with humans, we lack a grounded benchmark to measure how well LLMs understand social language. Here, we introduce a new theory-driven benchmark, SocKET, that contains 58 NLP tasks testing social knowledge, which we group into five categories: humor & sarcasm, offensiveness, sentiment & emotion, social factors, and trustworthiness. In tests on the benchmark, we demonstrate that current models attain only moderate performance but reveal significant potential for task transfer among different types and categories of tasks, which were predicted from theory. Through zero-shot evaluations, we show that pretrained models already possess some innate but limited capabilities of social language understanding and training on one category of tasks can improve zero-shot testing on others. Our benchmark provides a systematic way to analyze model performance on an important dimension of language and points to clear room for improvement to build more socially-aware LLMs. The resources are released at https://github.com/minjechoi/SOCKET.
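
A minimal sketch of how one of the 58 SocKET tasks might be loaded for evaluation, assuming the benchmark is mirrored on the Hugging Face Hub under the repository name "Blablablab/SOCKET" and that "sarc" is one of the task configuration names; neither identifier is confirmed by the abstract above, so consult the linked GitHub repository for the exact names.

```python
# Sketch: load a single SocKET task with the Hugging Face `datasets` library.
# The dataset path "Blablablab/SOCKET" and the config name "sarc" are assumptions;
# the GitHub repository above lists the authoritative task identifiers.
from datasets import load_dataset

dataset = load_dataset("Blablablab/SOCKET", "sarc")  # hypothetical config name

# Each task provides text inputs and labels in standard splits.
print(dataset["train"][0])
```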
Anthology ID:
2023.emnlp-main.699
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11370–11403
URL:
https://aclanthology.org/2023.emnlp-main.699
DOI:
10.18653/v1/2023.emnlp-main.699
Cite (ACL):
Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, and David Jurgens. 2023. Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 11370–11403, Singapore. Association for Computational Linguistics.
Cite (Informal):
Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark (Choi et al., EMNLP 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2023.emnlp-main.699.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-4/2023.emnlp-main.699.mp4