Identification of Risk Factors in Clinical Texts through Association Rules
Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, Zhivko Angelov
Abstract
We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known, but medical conditions are expressed in clinical narratives with much greater variety. We propose an approach for data-driven learning of specialised medical vocabulary which, once collected, enables early alerting of potentially affected patients. The method is illustrated by experiments with clinical records of patients with Chronic Obstructive Pulmonary Disease (COPD) and comorbidity of COPD, Diabetes Mellitus and Schizophrenia. Our input data come from the Bulgarian Diabetic Register, which is built using a pseudonymised collection of outpatient records for about 500,000 diabetic patients. The generated Association Rules for COPD are analysed in the context of demographic, gender, and age information. Valuable amounts of meaningful words signalling risk factors are discovered with high precision and confidence.
- Anthology ID:
- W17-8009
- Volume:
- Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 64–72
- URL:
- https://doi.org/10.26615/978-954-452-044-1_009
- DOI:
- 10.26615/978-954-452-044-1_009
- Cite (ACL):
- Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, and Zhivko Angelov. 2017. Identification of Risk Factors in Clinical Texts through Association Rules. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, pages 64–72, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Identification of Risk Factors in Clinical Texts through Association Rules (Boytcheva et al., RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-044-1_009