A Corpus for Outbreak Detection of Diseases Prevalent in Latin America

Antonella Dellanzo, Viviana Cotik, Jose Ochoa-Luna


Abstract
In this paper we present an annotated corpus that can be used to train and test algorithms that automatically extract information about disease outbreaks from news and health reports. We also propose initial approaches to extracting information from it. The corpus was constructed with two main tasks in mind. The first is to extract outbreak-related entities, such as disease, host, and location, among others. The second is to retrieve relations among entities, for instance, that fifteen cases of a given disease were reported in a particular geographic location. Overall, our goal is to offer resources and tools for automated analysis that support early detection of disease outbreaks and thereby help limit their spread.
Anthology ID:
2020.conll-1.44
Volume:
Proceedings of the 24th Conference on Computational Natural Language Learning
Month:
November
Year:
2020
Address:
Online
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
543–551
URL:
https://aclanthology.org/2020.conll-1.44
DOI:
10.18653/v1/2020.conll-1.44
Cite (ACL):
Antonella Dellanzo, Viviana Cotik, and Jose Ochoa-Luna. 2020. A Corpus for Outbreak Detection of Diseases Prevalent in Latin America. In Proceedings of the 24th Conference on Computational Natural Language Learning, pages 543–551, Online. Association for Computational Linguistics.
Cite (Informal):
A Corpus for Outbreak Detection of Diseases Prevalent in Latin America (Dellanzo et al., CoNLL 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.conll-1.44.pdf