The GeneReg Corpus for Gene Expression Regulation Events — An Overview of the Corpus and its In-Domain and Out-of-Domain Interoperability

Ekaterina Buyko, Elena Beisswanger, Udo Hahn


Abstract
Although a large variety of corpora exists in the biomedical domain, their annotations differ in many respects, e.g., in the coverage of different, highly specialized knowledge domains, in the granularity of the targeted relations, and in how specifically relations and named entities are linguistically anchored in documents. We present GeneReg (Gene Regulation Corpus), the result of an annotation campaign led by the Jena University Language & Information Engineering (JULIE) Lab. The GeneReg corpus consists of 314 abstracts dealing with the regulation of gene expression in the model organism E. coli. This paper focuses on the compatibility of the GeneReg corpus with the alternative Genia event corpus and with several in-domain and out-of-domain lexical resources, e.g., the Specialist Lexicon, FrameNet, and WordNet. The links we established from the GeneReg corpus to these external resources will help improve the performance of the automatic relation extraction engine JREx, which is trained and evaluated on GeneReg.
Anthology ID:
L10-1280
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/407_Paper.pdf
Cite (ACL):
Ekaterina Buyko, Elena Beisswanger, and Udo Hahn. 2010. The GeneReg Corpus for Gene Expression Regulation Events — An Overview of the Corpus and its In-Domain and Out-of-Domain Interoperability. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
The GeneReg Corpus for Gene Expression Regulation Events — An Overview of the Corpus and its In-Domain and Out-of-Domain Interoperability (Buyko et al., LREC 2010)