Qualitative Analysis of Depression Models by Demographics

Carlos Aguirre, Mark Dredze


Abstract
Models for identifying depression using social media text exhibit biases towards different gender and racial/ethnic groups. Representation and balance of groups within the dataset are contributing factors, but differences in content and social media use may further explain these biases. We present an analysis of the content of social media posts from different demographic groups. Our analysis shows that there are content differences between depression and control subgroups across demographic groups, and that temporal topics and demographic-specific topics are correlated with downstream depression model error. We discuss the implications of our work for creating future datasets, as well as for designing and training models for mental health.
Anthology ID:
2021.clpsych-1.19
Volume:
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access
Month:
June
Year:
2021
Address:
Online
Venue:
CLPsych
Publisher:
Association for Computational Linguistics
Pages:
169–180
URL:
https://aclanthology.org/2021.clpsych-1.19
DOI:
10.18653/v1/2021.clpsych-1.19
Cite (ACL):
Carlos Aguirre and Mark Dredze. 2021. Qualitative Analysis of Depression Models by Demographics. In Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access, pages 169–180, Online. Association for Computational Linguistics.
Cite (Informal):
Qualitative Analysis of Depression Models by Demographics (Aguirre & Dredze, CLPsych 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.clpsych-1.19.pdf