Abstract
In this paper we discuss five different corpora annotated forprotein names. We present several within- and cross-dataset proteintagging experiments showing that different annotation schemes severelyaffect the portability of statistical protein taggers. By means of adetailed error analysis we identify crucial annotation issues thatfuture annotation projects should take into careful consideration.- Anthology ID:
- L06-1235
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/398_pdf.pdf
- DOI:
- Cite (ACL):
- Beatrice Alex, Malvina Nissim, and Claire Grover. 2006. The Impact of Annotation on the Performance of Protein Tagging in Biomedical Text. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- The Impact of Annotation on the Performance of Protein Tagging in Biomedical Text (Alex et al., LREC 2006)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/398_pdf.pdf