Detection of Mental Health from Reddit via Deep Contextualized Representations
Zhengping Jiang, Sarah Ita Levitan, Jonathan Zomick, Julia Hirschberg
Abstract
We address the problem of automatic detection of psychiatric disorders from the linguistic content of social media posts. We build a large scale dataset of Reddit posts from users with eight disorders and a control user group. We extract and analyze linguistic characteristics of posts and identify differences between diagnostic groups. We build strong classification models based on deep contextualized word representations and show that they outperform previously applied statistical models with simple linguistic features by large margins. We compare user-level and post-level classification performance, as well as an ensembled multiclass model.
- Anthology ID:
- 2020.louhi-1.16
- Volume:
- Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Eben Holderness, Antonio Jimeno Yepes, Alberto Lavelli, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
- Venue:
- Louhi
- Publisher:
- Association for Computational Linguistics
- Pages:
- 147–156
- URL:
- https://aclanthology.org/2020.louhi-1.16
- DOI:
- 10.18653/v1/2020.louhi-1.16
- Cite (ACL):
- Zhengping Jiang, Sarah Ita Levitan, Jonathan Zomick, and Julia Hirschberg. 2020. Detection of Mental Health from Reddit via Deep Contextualized Representations. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, pages 147–156, Online. Association for Computational Linguistics.
- Cite (Informal):
- Detection of Mental Health from Reddit via Deep Contextualized Representations (Jiang et al., Louhi 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2020.louhi-1.16.pdf