BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona T. Diab, Maarten Sap
Abstract
In this work, we tackle the challenge of embedding realistic human personality traits into LLMs. Previous approaches have primarily focused on prompt-based methods that describe the behavior associated with the desired personality traits, which suffer from realism and validity issues. To address these limitations, we introduce BIG5-CHAT, a large-scale dataset containing 100,000 dialogues designed to ground models in how humans express their personality in text. Leveraging this dataset, we explore Supervised Fine-Tuning and Direct Preference Optimization as training-based methods to align LLMs more naturally with human personality patterns. Our methods outperform prompting on personality assessments such as BFI and IPIP-NEO, with trait correlations more closely matching human data. Furthermore, our experiments reveal that models trained to exhibit higher conscientiousness, higher agreeableness, lower extraversion, and lower neuroticism display better performance on reasoning tasks, aligning with psychological findings on how these traits impact human cognitive performance. To our knowledge, this work is the first comprehensive study to demonstrate how training-based methods can shape LLM personalities through learning from real human behaviors.
- Anthology ID:
- 2025.acl-long.999
- Volume:
- Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 20434–20471
- URL:
- https://preview.aclanthology.org/landing_page/2025.acl-long.999/
- Cite (ACL):
- Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona T. Diab, and Maarten Sap. 2025. BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 20434–20471, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data (Li et al., ACL 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.acl-long.999.pdf