GLoHBCD: A Naturalistic German Dataset for Language of Health Behaviour Change on Online Support Forums

Selina Meyer, David Elsweiler


Abstract
Health behaviour change is a difficult and prolonged process that requires sustained motivation and determination. Conversational agents have previously shown promise in supporting the change process. One therapy approach that facilitates change and has been used as a framework for conversational agents is motivational interviewing. However, existing implementations of this therapy approach lack the deep understanding of user utterances that is essential to the spirit of motivational interviewing. To address this lack of understanding, we introduce the GLoHBCD, a German dataset of naturalistic language around health behaviour change. Data was sourced from a popular German weight loss forum and annotated using theoretically grounded motivational interviewing categories. We describe the process of dataset construction and present evaluation results. Initial experiments suggest a potential for broad applicability of the data and the resulting classifiers across different behaviour change domains. We make code to replicate the dataset and experiments available on GitHub.
Anthology ID:
2022.lrec-1.239
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2226–2235
URL:
https://aclanthology.org/2022.lrec-1.239
Cite (ACL):
Selina Meyer and David Elsweiler. 2022. GLoHBCD: A Naturalistic German Dataset for Language of Health Behaviour Change on Online Support Forums. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2226–2235, Marseille, France. European Language Resources Association.
Cite (Informal):
GLoHBCD: A Naturalistic German Dataset for Language of Health Behaviour Change on Online Support Forums (Meyer & Elsweiler, LREC 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.lrec-1.239.pdf
Code:
selinameyer/glohbcd