Colin Depp


Towards Intelligent Clinically-Informed Language Analyses of People with Bipolar Disorder and Schizophrenia
Ankit Aich | Avery Quynh | Varsha Badal | Amy Pinkham | Philip Harvey | Colin Depp | Natalie Parde
Findings of the Association for Computational Linguistics: EMNLP 2022

NLP offers a myriad of opportunities to support mental health research. However, prior work has focused almost exclusively on social media data, for which diagnoses are difficult or impossible to validate. We present a first-of-its-kind dataset of manually transcribed interactions with people clinically diagnosed with bipolar disorder and schizophrenia, as well as healthy controls. Data was collected through validated clinical tasks and paired with diagnostic measures. We extract 100+ temporal, sentiment, psycholinguistic, emotion, and lexical features from the data and establish classification validity using a variety of models to study language differences between diagnostic groups. Our models achieve strong classification performance (maximum F1 = 0.93-0.96) and reveal associations between linguistic features and diagnostic class. We hope that this dataset will offer high value to clinical and NLP researchers, with potential for broad impact.