Proceedings of the First Workshop on Natural Language Processing for Medical Conversations

Parminder Bhatia, Steven Lin, Rashmi Gangadharaiah, Byron Wallace, Izhak Shafran, Chaitanya Shivade, Nan Du, Mona Diab (Editors)

Association for Computational Linguistics

Methods for Extracting Information from Messages from Primary Care Providers to Specialists
Xiyu Ding | Michael Barnett | Ateev Mehrotra | Timothy Miller

Electronic consult (eConsult) systems allow specialists more flexibility to respond to referrals more efficiently, thereby increasing access in under-resourced healthcare settings like safety net systems. Understanding the usage patterns of eConsult systems is an important part of improving specialist efficiency. In this work, we develop and apply classifiers to a dataset of eConsult questions from primary care providers to specialists, classifying the messages by how they were triaged by the specialist office and by the underlying type of clinical question posed by the primary care provider. We show that pre-trained transformer models are strong baselines, with performance improving through domain-specific training and shared representations.

Towards Understanding ASR Error Correction for Medical Conversations
Anirudh Mani | Shruti Palaskar | Sandeep Konam

Domain Adaptation for Automatic Speech Recognition (ASR) error correction via machine translation is a useful technique for improving out-of-domain outputs of pre-trained ASR systems to obtain optimal results for specific in-domain tasks. We use this technique on our dataset of Doctor-Patient conversations using two off-the-shelf ASR systems: Google ASR (commercial) and the ASPIRE model (open-source). We train a Sequence-to-Sequence Machine Translation model and evaluate it on seven specific UMLS Semantic types, including Pharmacological Substance, Sign or Symptom, and Diagnostic Procedure. Lastly, we break down, analyze, and discuss the 7% overall improvement in word error rate in view of each Semantic type.

Studying Challenges in Medical Conversation with Structured Annotation
Nan Wang | Yan Song | Fei Xia

Medical conversation is a central part of medical care. Yet, the current state and quality of medical conversation are far from perfect. Therefore, a substantial amount of research has been done to obtain a better understanding of medical conversation and to address its practical challenges and dilemmas. In line with this stream of research, we have developed a multi-layer structure annotation scheme to analyze medical conversation, and are using the scheme to construct a corpus of naturally occurring medical conversation in a Chinese pediatric primary care setting. Some preliminary findings are reported regarding 1) how a medical conversation starts, 2) where communication problems tend to occur, and 3) how physicians close a conversation. Challenges and opportunities for research on medical conversation with NLP techniques will be discussed.

Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models
Seppo Enarvi | Marilisa Amoia | Miguel Del-Agua Teba | Brian Delaney | Frank Diehl | Stefan Hahn | Kristina Harris | Liam McGrath | Yue Pan | Joel Pinto | Luca Rubini | Miguel Ruiz | Gagandeep Singh | Fabian Stemmer | Weiyi Sun | Paul Vozila | Thomas Lin | Ranjani Ramamurthy

We discuss automatic creation of medical reports from ASR-generated patient-doctor conversational transcripts using an end-to-end neural summarization approach. We explore both recurrent neural network (RNN) and Transformer-based sequence-to-sequence architectures for summarizing medical conversations. We have incorporated enhancements to these architectures, such as the pointer-generator network that facilitates copying parts of the conversations to the reports, and a hierarchical RNN encoder that makes RNN training three times faster with long inputs. A comparison of the relative improvements from the different model architectures over an oracle extractive baseline is provided on a dataset of 800k orthopedic encounters. Consistent with observations in the literature for machine translation and related tasks, we find that the Transformer models outperform RNNs in accuracy, while taking less than half the time to train. Large gains over a strong oracle baseline indicate that sequence-to-sequence modeling is a promising approach for automatic generation of medical reports, in the presence of data at scale.

Towards an Ontology-based Medication Conversational Agent for PrEP and PEP
Muhammad Amith | Licong Cui | Kirk Roberts | Cui Tao

HIV (human immunodeficiency virus) can damage a human’s immune system and cause Acquired Immunodeficiency Syndrome (AIDS), which could lead to severe outcomes, including death. While HIV infections have decreased over the last decade, there is still a significant population where the infection permeates. PrEP and PEP are two proven preventive measures that involve periodic dosage to stop the onset of HIV infection. However, the adherence rates for this medication are low, in part due to the lack of information about the medication. There exist several communication barriers that prevent patient-provider communication from happening. In this work, we present our ontology-based method for automating the communication of this medication that can be deployed for live conversational agents for PrEP and PEP. This method facilitates a model of automated conversation between the machine and user and can also answer relevant questions.

Heart Failure Education of African American and Hispanic/Latino Patients: Data Collection and Analysis
Itika Gupta | Barbara Di Eugenio | Devika Salunke | Andrew Boyd | Paula Allen-Meares | Carolyn Dickens | Olga Garcia

Heart failure is a global epidemic with debilitating effects. People with heart failure need to actively participate in home self-care regimens to maintain good health. However, these regimens are not as effective as they could be and are influenced by a variety of factors. Patients from minority communities like African American (AA) and Hispanic/Latino (H/L) often have poor outcomes compared to the average Caucasian population. In this paper, we lay the groundwork to develop an interactive dialogue agent that can assist AA and H/L patients in a culturally sensitive and linguistically accurate manner with their heart health care needs. This will be achieved by extracting relevant educational concepts from the interactions between health educators and patients. Thus far we have recorded and transcribed 20 such interactions. We describe our data collection process and our thematic and initiative analysis of the interactions, and outline our future steps.

On the Utility of Audiovisual Dialog Technologies and Signal Analytics for Real-time Remote Monitoring of Depression Biomarkers
Michael Neumann | Oliver Roessler | David Suendermann-Oeft | Vikram Ramanarayanan

We investigate the utility of audiovisual dialog systems combined with speech and video analytics for real-time remote monitoring of depression at scale in uncontrolled environment settings. We collected audiovisual conversational data from participants who interacted with a cloud-based multimodal dialog system, and automatically extracted a large set of speech and vision metrics based on the rich existing literature of laboratory studies. We report on the efficacy of various audio and video metrics in differentiating people with mild, moderate and severe depression, and discuss the implications of these results for the deployment of such technologies in real-world neurological diagnosis and monitoring applications.

Robust Prediction of Punctuation and Truecasing for Medical ASR
Monica Sunkara | Srikanth Ronanki | Kalpit Dixit | Sravan Bodapati | Katrin Kirchhoff

Automatic speech recognition (ASR) systems in the medical domain that focus on transcribing clinical dictations and doctor-patient conversations often pose many challenges due to the complexity of the domain. ASR output typically undergoes automatic punctuation to enable users to speak naturally, without having to vocalize awkward and explicit punctuation commands, such as “period”, “add comma” or “exclamation point”, while truecasing enhances user readability and improves the performance of downstream NLP tasks. This paper proposes a conditional joint modeling framework for prediction of punctuation and truecasing using pretrained masked language models such as BERT, BioBERT and RoBERTa. We also present techniques for domain- and task-specific adaptation by fine-tuning masked language models with medical domain data. Finally, we improve the robustness of the model against common errors made in ASR by performing data augmentation. Experiments performed on dictation and conversational style corpora show that our proposed model achieves 5% absolute improvement on ground truth text and 10% improvement on ASR outputs over baseline models under the F1 metric.

Topic-Based Measures of Conversation for Detecting Mild Cognitive Impairment
Meysam Asgari | Liu Chen | Hiroko Dodge

Conversation is a complex cognitive task that engages multiple aspects of cognitive functions to remember the discussed topics, monitor the semantic and linguistic elements, and recognize others’ emotions. In this paper, we propose a computational method based on the lexical coherence of consecutive utterances to quantify topical variations in semi-structured conversations of older adults with cognitive impairments. Extracting the lexical knowledge of conversational utterances, our method generates a set of novel conversational measures that indicate underlying cognitive deficits among subjects with mild cognitive impairment (MCI). Our preliminary results verify the utility of the proposed conversation-based measures in distinguishing MCI from healthy controls.