@inproceedings{li-etal-2025-decoding-llm,
title = "Decoding {LLM} Personality Measurement: Forced-Choice vs. {L}ikert",
author = "Li, Xiaoyu and
Shi, Haoran and
Yu, Zengyi and
Tu, Yukun and
Zheng, Chanjin",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.480/",
doi = "10.18653/v1/2025.findings-acl.480",
pages = "9234--9247",
ISBN = "979-8-89176-256-5",
abstract = "Recent research has focused on investigating the psychological characteristics of Large Language Models (LLMs), emphasizing the importance of comprehending their behavioral traits. Likert scale personality questionnaires have become the primary tool for assessing these characteristics in LLMs. However, such scales can be skewed by factors such as social desirability, distorting the assessment of true personality traits. To address this issue, we firstly incorporate the forced-choice test, a method known for reducing response bias in human personality assessments, into the evaluation of LLM. Specifically, we evaluated six LLMs: Llama-3.1-8B, GLM-4-9B, GPT-3.5-turbo, GPT-4o, Claude-3.5-sonnet, and Deepseek-V3. We compared the Likert scale and forced-choice test results for LLMs' Big Five personality scores, as well as their reliability. In addition, we looked at how temperature parameter and language affected LLM personality scores. The results show that the forced-choice test better captures differences between LLMs across various personality dimensions and is less influenced by temperature parameters. Furthermore, we found both broad trends and specific variations in personality scores across models and languages."
}
Markdown (Informal)
[Decoding LLM Personality Measurement: Forced-Choice vs. Likert](https://aclanthology.org/2025.findings-acl.480/) (Li et al., Findings 2025)