Knowledge Discovery in COVID-19 Research Literature

Ernesto L. Estevanell-Valladares, Alejandro Piad-Morffis, Suilan Estevez-Velarde, Yoan Gutierrez, Andres Montoyo, Rafael Muñoz, Yudivián Almeida-Cruz


Abstract
This paper presents the preliminary results of an ongoing project that analyzes the growing body of scientific research published around the COVID-19 pandemic. In this research, a general-purpose semantic model is used to double-annotate a batch of 500 sentences manually selected from the CORD-19 corpus. A baseline text-mining pipeline is then designed and evaluated on a large batch of 100,959 sentences. We present a qualitative analysis of the most interesting automatically extracted facts and highlight possible future lines of development. The preliminary results show that general-purpose semantic models are a useful tool for discovering fine-grained knowledge in large corpora of scientific documents.
Anthology ID:
2021.ranlp-1.46
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
402–410
URL:
https://preview.aclanthology.org/ingest-swisstext/2021.ranlp-1.46/
Cite (ACL):
Ernesto L. Estevanell-Valladares, Alejandro Piad-Morffis, Suilan Estevez-Velarde, Yoan Gutierrez, Andres Montoyo, Rafael Muñoz, and Yudivián Almeida-Cruz. 2021. Knowledge Discovery in COVID-19 Research Literature. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 402–410, Held Online. INCOMA Ltd.
Cite (Informal):
Knowledge Discovery in COVID-19 Research Literature (Estevanell-Valladares et al., RANLP 2021)
PDF:
https://preview.aclanthology.org/ingest-swisstext/2021.ranlp-1.46.pdf