Learning to Ask Like a Physician
Eric Lehman, Vladislav Lialin, Katelyn Edelwina Legaspi, Anne Janelle Sy, Patricia Therese Pile, Nicole Rose Alberto, Richard Raymund Ragasa, Corinna Victoria Puyat, Marianne Katharina Taliño, Isabelle Rose Alberto, Pia Gabrielle Alfonso, Dana Moukheiber, Byron Wallace, Anna Rumshisky, Jennifer Liang, Preethi Raghavan, Leo Anthony Celi, Peter Szolovits
Abstract
Existing question answering (QA) datasets derived from electronic health records (EHR) are artificially generated and consequently fail to capture realistic physician information needs. We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions paired with the snippets of text (triggers) that prompted each question. The questions are generated by medical experts from 100+ MIMIC-III discharge summaries. We analyze this dataset to characterize the types of information sought by medical experts. We also train baseline models for trigger detection and question generation (QG), paired with unsupervised answer retrieval over EHRs. Our baseline model is able to generate high quality questions in over 62% of cases when prompted with human selected triggers. We release this dataset (and all code to reproduce baseline model results) to facilitate further research into realistic clinical QA and QG: https://github.com/elehman16/discq.
- Anthology ID:
- 2022.clinicalnlp-1.8
- Volume:
- Proceedings of the 4th Clinical Natural Language Processing Workshop
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, WA
- Editors:
- Tristan Naumann, Steven Bethard, Kirk Roberts, Anna Rumshisky
- Venue:
- ClinicalNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 74–86
- URL:
- https://aclanthology.org/2022.clinicalnlp-1.8
- DOI:
- 10.18653/v1/2022.clinicalnlp-1.8
- Cite (ACL):
- Eric Lehman, Vladislav Lialin, Katelyn Edelwina Legaspi, Anne Janelle Sy, Patricia Therese Pile, Nicole Rose Alberto, Richard Raymund Ragasa, Corinna Victoria Puyat, Marianne Katharina Taliño, Isabelle Rose Alberto, Pia Gabrielle Alfonso, Dana Moukheiber, Byron Wallace, Anna Rumshisky, Jennifer Liang, Preethi Raghavan, Leo Anthony Celi, and Peter Szolovits. 2022. Learning to Ask Like a Physician. In Proceedings of the 4th Clinical Natural Language Processing Workshop, pages 74–86, Seattle, WA. Association for Computational Linguistics.
- Cite (Informal):
- Learning to Ask Like a Physician (Lehman et al., ClinicalNLP 2022)
- PDF:
- https://preview.aclanthology.org/landing_page/2022.clinicalnlp-1.8.pdf
- Code:
- elehman16/discq
- Data:
- DiSCQ, emrQA