Abstract
Named Entity Recognition (NER) is an important component of natural language processing (NLP) with applications in the biomedical domain, where it enables knowledge discovery from medical texts. Because few linguistic resources specific to the biomedical domain exist for Romanian, a sub-corpus dedicated to this domain was created. In this paper we present this newly developed Romanian sub-corpus for medical-domain NER, a valuable asset for biomedical text processing. We describe the sub-corpus, provide statistics about its composition, and evaluate an automatic NER tool on the newly created resource.
- Anthology ID:
- R17-1066
- Volume:
- Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 501–509
- URL:
- https://doi.org/10.26615/978-954-452-049-6_066
- DOI:
- 10.26615/978-954-452-049-6_066
- Cite (ACL):
- Maria Mitrofan. 2017. Bootstrapping a Romanian Corpus for Medical Named Entity Recognition. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 501–509, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Bootstrapping a Romanian Corpus for Medical Named Entity Recognition (Mitrofan, RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-049-6_066