Deahan Yu
2026
NU_DeepHealthNLP at #SMM4H-HeaRD 2026: Entity-Conditioned Generation and a Four-Stage Pipeline for Automated SOAP Note Generation
Thanya Mysore Santhosh | Deahan Yu
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
We describe two system submissions to Task 4 of the SMM4H-HeaRD 2026 Shared Task on automated SOAP note generation from doctor–patient dialogues. Our first submission is a standalone entity-conditioned generation model: Mistral-7B-Instruct-v0.1 fine-tuned with QLoRA on 8,529 MedSynth training dialogues, where both training and inference prompts include clinical entities extracted and grouped by SOAP section. Our second submission is a four-stage modular pipeline that additionally incorporates a hybrid retrieval stage and a rule-based verification stage. The key finding of this work is that incorporating structured clinical domain knowledge, in the form of NER entities grouped by SOAP section, directly into the generation prompt produces consistent and reliable improvements over dialogue-only generation. Our four-stage pipeline submission achieved an average score of 0.54 on the official test set, ranking first on the shared task leaderboard.
2020
Identifying Medication Abuse and Adverse Effects from Tweets: University of Michigan at #SMM4H 2020
V. G. Vinod Vydiswaran | Deahan Yu | Xinyan Zhao | Ermioni Carr | Jonathan Martindale | Jingcheng Xiao | Noha Ghannam | Matteo Althoen | Alexis Castellanos | Neel Patel | Daniel Vasquez
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task
The team from the University of Michigan participated in three tasks in the Social Media Mining for Health Applications (#SMM4H) 2020 shared tasks – on detecting mentions of adverse effects (Task 2), extracting and normalizing them (Task 3), and detecting mentions of medication abuse (Task 4). Our approaches relied on a combination of traditional machine learning and deep learning models. On Tasks 2 and 4, our submitted runs performed at or above the task average.
2019
Towards Text Processing Pipelines to Identify Adverse Drug Events-related Tweets: University of Michigan @ SMM4H 2019 Task 1
V. G. Vinod Vydiswaran | Grace Ganzel | Bryan Romas | Deahan Yu | Amy Austin | Neha Bhomia | Socheatha Chan | Stephanie Hall | Van Le | Aaron Miller | Olawunmi Oduyebo | Aulia Song | Radhika Sondhi | Danny Teng | Hao Tseng | Kim Vuong | Stephanie Zimmerman
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task
We participated in Task 1 of the Social Media Mining for Health Applications (SMM4H) 2019 Shared Tasks on detecting mentions of adverse drug events (ADEs) in tweets. Our approach relied on a text processing pipeline for tweets, and training traditional machine learning and deep learning models. Our submitted runs performed above average for the task.
Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation
Xinyan Zhao | Deahan Yu | V. G. Vinod Vydiswaran
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task
Identifying mentions of medical concepts in social media is challenging because of high variability in free text. In this paper, we propose a novel neural network architecture, the Collocated LSTM with Attentive Pooling and Aggregated representation (CLAPA), that integrates a bidirectional LSTM model with attention and pooling strategy and utilizes the collocation information from training data to improve the representation of medical concepts. The collocation and aggregation layers improve the model performance on the task of identifying mentions of adverse drug events (ADE) in tweets. Using the dataset made available as part of the workshop shared task, we show that careful selection of neighborhood contexts can help uncover useful local information and improve the overall medical concept representation.
Co-authors
- V. G. Vinod Vydiswaran 3
- Xinyan Zhao 2
- Matteo Althoen 1
- Amy Austin 1
- Neha Bhomia 1
- Ermioni Carr 1
- Alexis Castellanos 1
- Socheatha Chan 1
- Grace Ganzel 1
- Noha Ghannam 1
- Stephanie Hall 1
- Van Le 1
- Jonathan Martindale 1
- Aaron Miller 1
- Olawunmi Oduyebo 1
- Neel Patel 1
- Bryan Romas 1
- Thanya Mysore Santhosh 1
- Radhika Sondhi 1
- Aulia Song 1
- Danny Teng 1
- Hao Tseng 1
- Daniel Vasquez 1
- Kim Vuong 1
- Jingcheng Xiao 1
- Stephanie Zimmerman 1