PersonalityDBench: A Dataset for Personality Disorders - from Modeling to Controlled Generation
Federico Ravenda, Seyed Ali Bahrainian, Daniele Montagnani, Antonietta Mira, Andrea Raballo
Abstract
Personality disorders (PDs) are a complex class of mental health (MH) conditions characterized by persistent patterns of cognition, behavior, and emotional regulation that deviate from cultural norms. While social media has become a valuable resource for MH research, NLP has largely focused on more prevalent conditions (e.g., depression), leaving PDs underexplored. In this work, we introduce PersonalityDBench, a large-scale, clinically grounded dataset that supports multidimensional study of personality pathology and standardized, reproducible evaluation of LLM steering toward clinically grounded behavioral targets. The dataset comprises two parts: (1) PRISMA and (2) PersonaDSteering. (1) PRISMA (PeRsonality dISorder MAnifestations) is a clinically annotated collection of social media content spanning the full spectrum of PDs. It links clinically validated diagnostic criteria and dimensional trait frameworks with computational annotation and analysis methods to support fine-grained, multidimensional study of how PDs manifest in naturalistic, free-form language. Building on PRISMA, (2) PersonaDSteering is a benchmark for LLM steering evaluation that operationalizes clinically grounded PD profiles into structured behavioral elicitation tasks, enabling multidimensional steerability assessment beyond single-behavior settings and supporting PD-consistent persona construction for simulated patient generation. The dataset may find application in the study and modeling of PDs and in powering personality-specific text generation for adaptive, personalized chat systems.
- Anthology ID:
- 2026.acl-long.1395
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 30239–30259
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1395/
- Cite (ACL):
- Federico Ravenda, Seyed Ali Bahrainian, Daniele Montagnani, Antonietta Mira, and Andrea Raballo. 2026. PersonalityDBench: A Dataset for Personality Disorders - from Modeling to Controlled Generation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 30239–30259, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- PersonalityDBench: A Dataset for Personality Disorders - from Modeling to Controlled Generation (Ravenda et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1395.pdf