Abstract
A significant limitation in developing therapeutic chatbots to support people going through psychological distress is the lack of high-quality, large-scale datasets capturing conversations between clients and trained counselors. As a remedy, researchers have focused their attention on scraping conversational data from peer support platforms such as Reddit. But the extent to which the responses from peers align with responses from trained counselors is understudied. We address this gap by analyzing the differences between responses from counselors and peers by getting trained counselors to annotate ≈17K such responses using Motivational Interviewing Treatment Integrity (MITI) code, a well-established behavioral coding system that differentiates between favorable and unfavorable responses. We developed an annotation pipeline with several stages of quality control. Due to its design, this method was able to achieve 97% of coverage, meaning that out of the 17.3K responses we successfully labeled 16.8K with a moderate agreement. We use this data to conclude the extent to which conversational data from peer support platforms align with real therapeutic conversations and discuss in what ways they can be exploited to train therapeutic chatbots.- Anthology ID:
- 2022.coling-1.293
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue:
- COLING
- SIG:
- Publisher:
- International Committee on Computational Linguistics
- Note:
- Pages:
- 3315–3330
- Language:
- URL:
- https://aclanthology.org/2022.coling-1.293
- DOI:
- Cite (ACL):
- Anuradha Welivita and Pearl Pu. 2022. Curating a Large-Scale Motivational Interviewing Dataset Using Peer Support Forums. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3315–3330, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- Curating a Large-Scale Motivational Interviewing Dataset Using Peer Support Forums (Welivita & Pu, COLING 2022)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2022.coling-1.293.pdf
- Code
- anuradha1992/motivational-interviewing-dataset