2019
CRAFT Shared Tasks 2019 Overview — Integrated Structure, Semantics, and Coreference
William Baumgartner | Michael Bada | Sampo Pyysalo | Manuel R. Ciosici | Negacy Hailu | Harrison Pielke-Lombardo | Michael Regan | Lawrence Hunter
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks
As part of the BioNLP Open Shared Tasks 2019, the CRAFT Shared Tasks 2019 provides a platform to gauge the state of the art for three fundamental language processing tasks — dependency parse construction, coreference resolution, and ontology concept identification — over full-text biomedical articles. The structural annotation task requires the automatic generation of dependency parses for each sentence of an article given only the article text. The coreference resolution task focuses on linking coreferring base noun phrase mentions into chains using the symmetrical and transitive identity relation. The ontology concept annotation task involves the identification of concept mentions within text using the classes of ten distinct ontologies in the biomedical domain, both unmodified and augmented with extension classes. This paper provides an overview of each task, including descriptions of the data provided to participants and the evaluation metrics used, and discusses participant results relative to baseline performances for each of the three tasks.
2014
Sublanguage Corpus Analysis Toolkit: A tool for assessing the representativeness and sublanguage characteristics of corpora
Irina Temnikova | William A. Baumgartner Jr. | Negacy D. Hailu | Ivelina Nikolova | Tony McEnery | Adam Kilgarriff | Galia Angelova | K. Bretonnel Cohen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Sublanguages are varieties of language that form subsets of the general language, typically exhibiting particular types of lexical, semantic, and other restrictions and deviance. SubCAT, the Sublanguage Corpus Analysis Toolkit, assesses the representativeness and closure properties of corpora to analyze the extent to which they are either sublanguages or representative samples of the general language. The current version of SubCAT contains scripts and applications for assessing lexical closure, morphological closure, sentence type closure, over-represented words, and syntactic deviance. Its operation is illustrated with three case studies concerning scientific journal articles, patents, and clinical records. Materials from two language families are analyzed: English (Germanic) and Bulgarian (Slavic). The software is available at sublanguage.sourceforge.net under a liberal Open Source license.
Temporal Expression Recognition for Cell Cycle Phase Concepts in Biomedical Literature
Negacy Hailu | Natalya Panteleyeva | Kevin Cohen
Proceedings of BioNLP 2014
2013
UColorado_SOM: Extraction of Drug-Drug Interactions from Biomedical Text using Knowledge-rich and Knowledge-poor Features
Negacy Hailu | Lawrence E. Hunter | K. Bretonnel Cohen
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)
Measuring Closure Properties of Patent Sublanguages
Irina Temnikova | Negacy Hailu | Galia Angelova | K. Bretonnel Cohen
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013