Psycholinguistic Profiles of Cognitive Distortions in Reddit Data

Neha Sharma, Navneet Agarwal, Kairit Sirts


Abstract
Cognitive distortions (CDs) are systematically biased patterns of thinking associated with the onset and maintenance of mental health conditions such as depression and anxiety. Computational research on CDs has primarily focused on detection and classification, while the linguistic characterization of distorted language (which psycholinguistic features distinguish distorted from non-distorted text, and whether individual distortion types carry distinct language patterns) remains largely unexplored. Using a Reddit dataset, we apply a Generalized Linear Model (GLM) with bootstrap sampling to LIWC-derived features and find that CD language is psycholinguistically distinct from non-distorted language. We further characterize type-specific psycholinguistic profiles for each CD and, through hierarchical clustering, show that CD types are not fully separable, with certain distortions sharing stable linguistic signatures. Together, these findings contribute to the linguistic characterization of CDs, offering an empirically grounded account of the psycholinguistic properties that distinguish distorted language, both for CDs as a whole and across specific distortion types.
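As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch of a bootstrapped GLM for a single feature. It is not the authors' implementation: the abstract does not specify the GLM family, link function, covariates, or number of bootstrap samples, so the sketch assumes a logistic (binomial) GLM, one synthetic feature standing in for a LIWC score, and a percentile confidence interval. All function names and data are hypothetical.

```python
import math
import random


def fit_logistic(x, y, iters=25, damping=1e-6):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, b0 + b1 * xi))  # clip to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        h00 += damping                # tiny damping keeps the matrix invertible
        h11 += damping
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step via the 2x2 inverse
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-8:
            break
    return b0, b1


def bootstrap_slope_ci(x, y, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the slope, resampling posts with replacement."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        _, b1 = fit_logistic([x[i] for i in idx], [y[i] for i in idx])
        slopes.append(b1)
    slopes.sort()
    lo = slopes[int(alpha / 2 * n_boot)]
    hi = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic stand-in data (assumption): y = 1 marks a distorted post, and x is a
# made-up feature score that is higher on average for distorted posts.
rng = random.Random(42)
y = [1] * 150 + [0] * 150
x = [rng.gauss(1.0 if yi else 0.0, 1.0) for yi in y]

_, slope = fit_logistic(x, y)
lo, hi = bootstrap_slope_ci(x, y)
print(f"slope={slope:.2f}, 95% bootstrap CI=({lo:.2f}, {hi:.2f})")
```

Under these assumptions, a feature would be read as distinguishing distorted from non-distorted text when its bootstrap interval excludes zero. In practice one would likely fit the model with an established statistics library and include all features jointly; the hand-written fit here only keeps the sketch dependency-free.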
Anthology ID:
2026.clpsych-1.25
Volume:
Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Aya Zirikly, Kfir Bar, Sean MacAvaney, Molly Ireland, Yaakov Ophir, Dana Atzil-Slonim, Vasudha Varadarajan, Steven Bedrick, Bart Desmet
Venues:
CLPsych | WS
Publisher:
Association for Computational Linguistics
Pages:
306–323
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.clpsych-1.25/
Cite (ACL):
Neha Sharma, Navneet Agarwal, and Kairit Sirts. 2026. Psycholinguistic Profiles of Cognitive Distortions in Reddit Data. In Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2026), pages 306–323, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Psycholinguistic Profiles of Cognitive Distortions in Reddit Data (Sharma et al., CLPsych 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.clpsych-1.25.pdf