Mina Valizadeh


2024

pdf
CareCorpus: A Corpus of Real-World Solution-Focused Caregiver Strategies for Personalized Pediatric Rehabilitation Service Design
Mina Valizadeh | Vera C. Kaelin | Mary A. Khetani | Natalie Parde
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In pediatric rehabilitation services, one intervention approach involves using solution-focused caregiver strategies to support children in their daily life activities. The manual sharing of these strategies is not scalable, warranting need for an automated approach to recognize and select relevant strategies. We introduce CareCorpus, a dataset of 780 real-world strategies written by caregivers. Strategies underwent dual-annotation by three trained annotators according to four established rehabilitation classes (i.e., environment/context, n=325 strategies; a child’s sense of self, n=151 strategies; a child’s preferences, n=104 strategies; and a child’s activity competences, n=62 strategies) and a no-strategy class (n=138 instances) for irrelevant or indeterminate instances. The average percent agreement was 80.18%, with a Cohen’s Kappa of 0.75 across all classes. To validate this dataset, we propose multi-grained classification tasks for detecting and categorizing strategies, and establish new performance benchmarks ranging from F1=0.53-0.79. Our results provide a first step towards a smart option to sort caregiver strategies for use in designing pediatric rehabilitation care plans. This novel, interdisciplinary resource and application is also anticipated to generalize to other pediatric rehabilitation service contexts that target children with developmental need.

2023

pdf
What Clued the AI Doctor In? On the Influence of Data Source and Quality for Transformer-Based Medical Self-Disclosure Detection
Mina Valizadeh | Xing Qian | Pardis Ranjbar-Noiey | Cornelia Caragea | Natalie Parde
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Recognizing medical self-disclosure is important in many healthcare contexts, but it has been under-explored by the NLP community. We conduct a three-pronged investigation of this task. We (1) manually expand and refine the only existing medical self-disclosure corpus, resulting in a new, publicly available dataset of 3,919 social media posts with clinically validated labels and high compatibility with the existing task-specific protocol. We also (2) study the merits of pretraining task domain and text style by comparing Transformer-based models for this task, pretrained from general, medical, and social media sources. Our BERTweet condition outperforms the existing state of the art for this task by a relative F1 score increase of 16.73%. Finally, we (3) compare data augmentation techniques for this task, to assess the extent to which medical self-disclosure data may be further synthetically expanded. We discover that this task poses many challenges for data augmentation techniques, and we provide an in-depth analysis of identified trends.

2022

pdf
The AI Doctor Is In: A Survey of Task-Oriented Dialogue Systems for Healthcare Applications
Mina Valizadeh | Natalie Parde
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Task-oriented dialogue systems are increasingly prevalent in healthcare settings, and have been characterized by a diverse range of architectures and objectives. Although these systems have been surveyed in the medical community from a non-technical perspective, a systematic review from a rigorous computational perspective has to date remained noticeably absent. As a result, many important implementation details of healthcare-oriented dialogue systems remain limited or underspecified, slowing the pace of innovation in this area. To fill this gap, we investigated an initial pool of 4070 papers from well-known computer science, natural language processing, and artificial intelligence venues, identifying 70 papers discussing the system-level implementation of task-oriented dialogue systems for healthcare applications. We conducted a comprehensive technical review of these papers, and present our key findings including identified gaps and corresponding recommendations.

2021

pdf
Identifying Medical Self-Disclosure in Online Communities
Mina Valizadeh | Pardis Ranjbar-Noiey | Cornelia Caragea | Natalie Parde
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Self-disclosure in online health conversations may offer a host of benefits, including earlier detection and treatment of medical issues that may have otherwise gone unaddressed. However, research analyzing medical self-disclosure in online communities is limited. We address this shortcoming by introducing a new dataset of health-related posts collected from online social platforms, categorized into three groups (No Self-Disclosure, Possible Self-Disclosure, and Clear Self-Disclosure) with high inter-annotator agreement (_k_=0.88). We make this data available to the research community. We also release a predictive model trained on this dataset that achieves an accuracy of 81.02%, establishing a strong performance benchmark for this task.

2020

pdf
Modeling Dialogue in Conversational Cognitive Health Screening Interviews
Shahla Farzana | Mina Valizadeh | Natalie Parde
Proceedings of the Twelfth Language Resources and Evaluation Conference

Automating straightforward clinical tasks can reduce workload for healthcare professionals, increase accessibility for geographically-isolated patients, and alleviate some of the economic burdens associated with healthcare. A variety of preliminary screening procedures are potentially suitable for automation, and one such domain that has remained underexplored to date is that of structured clinical interviews. A task-specific dialogue agent is needed to automate the collection of conversational speech for further (either manual or automated) analysis, and to build such an agent, a dialogue manager must be trained to respond to patient utterances in a manner similar to a human interviewer. To facilitate the development of such an agent, we propose an annotation schema for assigning dialogue act labels to utterances in patient-interviewer conversations collected as part of a clinically-validated cognitive health screening task. We build a labeled corpus using the schema, and show that it is characterized by high inter-annotator agreement. We establish a benchmark dialogue act classification model for the corpus, thereby providing a proof of concept for the proposed annotation schema. The resulting dialogue act corpus is the first such corpus specifically designed to facilitate automated cognitive health screening, and lays the groundwork for future exploration in this area.