Leveraging Medical Literature for Section Prediction in Electronic Health Records

Sara Rosenthal, Ken Barker, Zhicheng Liang


Abstract
Electronic Health Records (EHRs) contain both structured content and unstructured (text) content about a patient’s medical history. In the unstructured text parts, there are common sections such as Assessment and Plan, Social History, and Medications. These sections help physicians find information easily and can be used by an information retrieval system to return specific information sought by a user. However, the exact format of sections in a particular EHR often does not adhere to known patterns. Therefore, being able to predict sections and headers in EHRs automatically is beneficial to physicians. Prior approaches to EHR section prediction have used only text data from EHRs and have required significant manual annotation. We propose using sections from medical literature (e.g., textbooks, journals, web content) that contain content similar to that found in EHR sections. Our approach uses data from a different kind of source, where labels are provided without the need for a time-consuming annotation effort. We use this data to train two models: an RNN and a BERT-based model. We apply the learned models along with source data via transfer learning to predict sections in EHRs. Our results show that medical literature can provide a helpful supervision signal for this classification task.
Anthology ID:
D19-1492
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4864–4873
URL:
https://aclanthology.org/D19-1492
DOI:
10.18653/v1/D19-1492
Cite (ACL):
Sara Rosenthal, Ken Barker, and Zhicheng Liang. 2019. Leveraging Medical Literature for Section Prediction in Electronic Health Records. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4864–4873, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Leveraging Medical Literature for Section Prediction in Electronic Health Records (Rosenthal et al., EMNLP-IJCNLP 2019)
PDF:
https://preview.aclanthology.org/landing_page/D19-1492.pdf