Social Welfare Function Leaderboard: On the Emergence of LLM Agents as the Welfare Dictator

Zhengliang Shi; Ruotian Ma; Jen-tse Huang; Xinbei Ma; Xingyu Chen; Mengru Wang; Qu Yang; Yue Wang; Fanghua Ye; Ziyang Chen; Shanyi Wang; Cixing LI; Wenxuan Wang; Zhaopeng Tu; Xiaolong Li; Zhaochun Ren; Liefeng Bo

Abstract

Large language models (LLMs) are increasingly entrusted with high-stakes decisions that affect human welfare. However, the principles and values that guide these models when distributing scarce societal resources remain largely unexamined. To address this, we introduce the Social Welfare Function (SWF) Benchmark, a dynamic simulation environment in which an LLM acts as a dictator, distributing tasks to heterogeneous recipients with different returns on investment (ROI). The benchmark is designed to create a dilemma between maximizing collective efficiency (i.e., overall ROI) and ensuring distributive fairness (measured by the Gini coefficient). We evaluate 20 state-of-the-art LLMs. Our findings reveal several key insights: (i) LLMs’ general ability, as measured by popular Arena leaderboards, does not align with their allocation skills; (ii) most LLMs exhibit a strong default utilitarian orientation, prioritizing overall productivity at the expense of equality; and (iii) allocation behaviors are highly manipulable and easily perturbed by common persuasion strategies. These results highlight the risks of deploying current LLMs as societal decision-makers and underscore the need for specialized benchmarks and alignment for AI governance.
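
To make the efficiency-fairness dilemma concrete, the following is a minimal sketch, not the benchmark's actual implementation. It assumes the standard mean-absolute-difference definition of the Gini coefficient and assumes that inequality is measured over the number of tasks each recipient receives; the abstract does not specify either detail. The function names `gini` and `total_return` and the ROI values are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Standard Gini coefficient (mean absolute difference form).

    0.0 means perfect equality; larger values mean more inequality.
    Assumption: this textbook definition, not necessarily the paper's variant.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Sum of absolute differences over all ordered pairs of recipients.
    abs_diff_sum = sum(abs(a - b) for a in values for b in values)
    return abs_diff_sum / (2 * n * total)


def total_return(allocation: Sequence[float], roi: Sequence[float]) -> float:
    """Collective efficiency: tasks given to each recipient times that recipient's ROI."""
    return sum(a * r for a, r in zip(allocation, roi))


# Hypothetical recipients with different returns on investment.
roi = [1.0, 2.0, 3.0]

# Two ways to distribute 12 tasks.
egalitarian = [4, 4, 4]   # equal shares
utilitarian = [0, 0, 12]  # everything to the highest-ROI recipient

for name, alloc in [("egalitarian", egalitarian), ("utilitarian", utilitarian)]:
    print(name, "return:", total_return(alloc, roi), "gini:", round(gini(alloc), 3))
```

In this toy setting the utilitarian allocation yields the higher total return but also the higher Gini coefficient, which is the kind of tension the benchmark is described as creating.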

Anthology ID: 2026.findings-acl.1919
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 38530–38551
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1919/
Cite (ACL): Zhengliang Shi, Ruotian Ma, Jen-tse Huang, Xinbei Ma, Xingyu Chen, Mengru Wang, Qu Yang, Yue Wang, Fanghua Ye, Ziyang Chen, Shanyi Wang, Cixing LI, Wenxuan Wang, Zhaopeng Tu, Xiaolong Li, Zhaochun Ren, and Liefeng Bo. 2026. Social Welfare Function Leaderboard: On the Emergence of LLM Agents as the Welfare Dictator. In Findings of the Association for Computational Linguistics: ACL 2026, pages 38530–38551, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Social Welfare Function Leaderboard: On the Emergence of LLM Agents as the Welfare Dictator (Shi et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1919.pdf
Checklist: 2026.findings-acl.1919.checklist.pdf
