Abstract
This paper presents a set of classification experiments for identifying depression in posts gathered from social media platforms. In addition to data gathered previously by other researchers, we collect new data from the social media platform Reddit. Our experiments show promising results for identifying depression from social media texts. More importantly, however, we show that the choice of corpora is crucial in identifying depression, and that a poor choice of data can lead to misleading conclusions.
- Anthology ID:
- W18-5903
- Volume:
- Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task
- Month:
- October
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Graciela Gonzalez-Hernandez, Davy Weissenbacher, Abeed Sarker, Michael Paul
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9–12
- URL:
- https://aclanthology.org/W18-5903
- DOI:
- 10.18653/v1/W18-5903
- Cite (ACL):
- Inna Pirina and Çağrı Çöltekin. 2018. Identifying Depression on Reddit: The Effect of Training Data. In Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task, pages 9–12, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Identifying Depression on Reddit: The Effect of Training Data (Pirina & Çöltekin, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/naacl24-info/W18-5903.pdf