Neha Sharma

2026

Psycholinguistic Profiles of Cognitive Distortions in Reddit Data
Neha Sharma | Navneet Agarwal | Kairit Sirts
Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2026)

Cognitive distortions (CDs) are systematically biased patterns of thinking associated with the onset and maintenance of mental health conditions such as depression and anxiety. Computational research on CDs has primarily focused on detection and classification, while the linguistic characterization of distorted language (which psycholinguistic features distinguish distorted from non-distorted text, and whether individual distortion types carry distinct language patterns) remains largely unexplored. Using a Reddit dataset, we apply a Generalized Linear Model (GLM) with bootstrap sampling to LIWC-derived features and find that CD language is psycholinguistically distinct from non-distorted language. We further characterize type-specific psycholinguistic profiles for each CD, and through hierarchical clustering show that CD types are not fully separable, with certain distortions sharing stable linguistic signatures. Together, these findings contribute to the linguistic characterization of CDs, offering an empirically grounded account of the psycholinguistic properties that distinguish distorted language at the level of CDs as a whole and across specific distortion types.

2024

Context is Important in Depressive Language: A Study of the Interaction Between the Sentiments and Linguistic Markers in Reddit Discussions
Neha Sharma | Kairit Sirts
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Research exploring linguistic markers in individuals with depression has demonstrated that language usage can serve as an indicator of mental health. This study investigates the impact of discussion topic as context on linguistic markers and emotional expression in depression, using a Reddit dataset to explore interaction effects. Contrary to common findings, our sentiment analysis revealed a broader range of emotional intensity in depressed individuals, with both higher negative and positive sentiments than controls. This pattern was driven by posts containing no emotion words, revealing the limitations of lexicon-based approaches in capturing the full emotional context. We observed several interesting results demonstrating the importance of contextual analyses. For instance, the use of 1st person singular pronouns and words related to anger and sadness correlated with increased positive sentiments, whereas a higher rate of present-focused words was associated with more negative sentiments. Our findings highlight the importance of discussion contexts when interpreting the language used in depression, revealing that the emotional intensity and meaning of linguistic markers can vary based on the topic of discussion.

Co-authors

Kairit Sirts 2
Navneet Agarwal 1
