Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?

Prateek Kumar Rajput; Yewei Song; Iyiola Emmanuel Olatunji; Jacques Klein; Tegawendé Bissyandé

Abstract

Can large language models reliably express a human-like personality, or are they merely mimicking surface cues without a stable underlying profile? We study this question on the long-form Essays Dataset, preferred over short, mood-driven text to target stable traits. Using the IPIP-NEO, a questionnaire-based self-evaluation test, we ask: (i) does post-training (SFT, DPO, ORPO) stabilize questionnaire scores under prompt rephrasings, and (ii) can it induce target Big Five profiles from unguided essays? Across five models, fine-tuning consistently reduces variance in questionnaire responses, mitigating the fragility seen in pre-trained models. Yet accuracy on the full five-dimensional profile remains near chance even when single-trait scores improve, indicating that unguided essays lack the cues needed for faithful personality expression. We argue for scenario-grounded datasets or interactive elicitation that accumulates test-aligned evidence over time.

Anthology ID: 2026.lrec-main.881
Volume: Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month: May
Year: 2026
Address: Palma de Mallorca, Spain
Editors: Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue: LREC
Publisher: ELRA Language Resource Association
Pages: 11272–11285
URL: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.881/
Cite (ACL): Prateek Kumar Rajput, Yewei Song, Iyiola Emmanuel Olatunji, Jacques Klein, and Tegawendé Bissyandé. 2026. Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?. International Conference on Language Resources and Evaluation, main:11272–11285.
Cite (Informal): Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost? (Rajput et al., LREC 2026)
PDF: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.881.pdf
