Abstract
In this paper we describe the process of annotating clinical texts with morphosyntactic and semantic information. The corpus contains 1,300 discharge letters in Bulgarian for patients with endocrine and metabolic disorders. The annotated corpus will be used as a gold standard for evaluating information extraction on a test corpus of 6,200 discharge letters. The annotation is performed within the Clark system, an XML-based system for corpora development. It provides a mechanism for semi-automatic annotation: first, a pipeline for Bulgarian morphosyntactic annotation and a cascaded regular grammar for semantic annotation are run; then rules for cleaning frequent errors are applied. Finally, the result is checked manually. We also hope to be able to adapt the morphosyntactic tagger to the domain of clinical narratives.
- Anthology ID:
- W17-8011
- Volume:
- Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Editors:
- Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 81–87
- URL:
- https://doi.org/10.26615/978-954-452-044-1_011
- DOI:
- 10.26615/978-954-452-044-1_011
- Cite (ACL):
- Ivajlo Radev, Kiril Simov, Galia Angelova, and Svetla Boytcheva. 2017. Annotation of Clinical Narratives in Bulgarian language. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, pages 81–87, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Annotation of Clinical Narratives in Bulgarian language (Radev et al., RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-044-1_011