CorEDs: A Corpus on Eating Disorders

Melissa Donati, Carlo Strapparava


Abstract
Eating disorders (EDs) constitute a widespread group of mental illnesses affecting the everyday life of many individuals in all age groups. One of the main difficulties in the diagnosis and treatment of these disorders is the interpersonal variability of symptoms and the variety of underlying psychological states that are not considered in traditional approaches. In order to gain a better understanding of these disorders, many studies have collected data from social media and analysed them from a computational perspective, but the resulting datasets were very limited and task-specific. Aiming to address this shortage by providing a dataset that could be easily adapted to different tasks, we built a corpus collecting ED-related and ED-unrelated comments from Reddit focusing on a limited number of topics (fitness, nutrition, etc.). To validate the effectiveness of the dataset, we evaluated the performance of two classifiers in distinguishing between ED-related and ED-unrelated comments. The high accuracy of both classifiers indicates that ED-related texts are separable from texts on similar topics that do not address EDs. For exploratory purposes, we also carried out a linguistic analysis of word class dominance in ED-related texts, whose results are consistent with the findings of psychological research on EDs.
Anthology ID:
2022.rapid-1.10
Volume:
Proceedings of the RaPID Workshop - Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments - within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Dimitrios Kokkinakis, Charalambos K. Themistocleous, Kristina Lundholm Fors, Athanasios Tsanas, Kathleen C. Fraser
Venue:
RaPID
Publisher:
European Language Resources Association
Pages:
80–85
URL:
https://aclanthology.org/2022.rapid-1.10
Cite (ACL):
Melissa Donati and Carlo Strapparava. 2022. CorEDs: A Corpus on Eating Disorders. In Proceedings of the RaPID Workshop - Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments - within the 13th Language Resources and Evaluation Conference, pages 80–85, Marseille, France. European Language Resources Association.
Cite (Informal):
CorEDs: A Corpus on Eating Disorders (Donati & Strapparava, RaPID 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2022.rapid-1.10.pdf