Curating a Large-Scale Motivational Interviewing Dataset Using Peer Support Forums

Anuradha Welivita, Pearl Pu


Abstract
A significant limitation in developing therapeutic chatbots to support people going through psychological distress is the lack of high-quality, large-scale datasets capturing conversations between clients and trained counselors. As a remedy, researchers have turned to scraping conversational data from peer support platforms such as Reddit. However, the extent to which responses from peers align with responses from trained counselors is understudied. We address this gap by analyzing the differences between responses from counselors and peers, having trained counselors annotate ≈17K such responses using the Motivational Interviewing Treatment Integrity (MITI) code, a well-established behavioral coding system that differentiates between favorable and unfavorable responses. We developed an annotation pipeline with several stages of quality control. Owing to its design, this method achieved 97% coverage, meaning that out of the 17.3K responses we successfully labeled 16.8K with moderate agreement. We use this data to assess the extent to which conversational data from peer support platforms aligns with real therapeutic conversations and discuss in what ways it can be exploited to train therapeutic chatbots.
Anthology ID:
2022.coling-1.293
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3315–3330
URL:
https://aclanthology.org/2022.coling-1.293
Cite (ACL):
Anuradha Welivita and Pearl Pu. 2022. Curating a Large-Scale Motivational Interviewing Dataset Using Peer Support Forums. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3315–3330, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Curating a Large-Scale Motivational Interviewing Dataset Using Peer Support Forums (Welivita & Pu, COLING 2022)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2022.coling-1.293.pdf
Code:
anuradha1992/motivational-interviewing-dataset