Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration
Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Simon Šuster
Abstract
Efficient discovery and exploration of biomedical literature has grown in importance in the context of the COVID-19 pandemic, and topic-based methods such as latent Dirichlet allocation (LDA) are a useful tool for this purpose. In this study we compare traditional topic models based on word tokens with topic models based on medical concepts, and propose several ways to improve topic coherence and specificity.
- Anthology ID:
- 2020.nlpcovid19-2.12
- Volume:
- Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
- Month:
- December
- Year:
- 2020
- Address:
- Online
- Editors:
- Karin Verspoor, Kevin Bretonnel Cohen, Michael Conway, Berry de Bruijn, Mark Dredze, Rada Mihalcea, Byron Wallace
- Venue:
- NLP-COVID19
- Publisher:
- Association for Computational Linguistics
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.nlpcovid19-2.12/
- DOI:
- 10.18653/v1/2020.nlpcovid19-2.12
- Cite (ACL):
- Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, and Simon Šuster. 2020. Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
- Cite (Informal):
- Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration (Otmakhova et al., NLP-COVID19 2020)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.nlpcovid19-2.12.pdf
- Data:
- CORD-19