Ways of Evaluation of the Annotators in Building the Prague Czech-English Dependency Treebank

Marie Mikulová, Jan Štěpánek

[How to correct problems with metadata yourself]


Abstract
In this paper, we present several ways to measure and evaluate the annotation and annotators, proposed and used during the building of the Czech part of the Prague Czech-English Dependency Treebank. At first, the basic principles of the treebank annotation project are introduced (division to three layers: morphological, analytical and tectogrammatical). The main part of the paper describes in detail one of the important phases of the annotation process: three ways of evaluation of the annotators - inter-annotator agreement, error rate and performance. The measuring of the inter-annotator agreement is complicated by the fact that the data contain added and deleted nodes, making the alignment between annotations non-trivial. The error rate is measured by a set of automatic checking procedures that guard the validity of some invariants in the data. The performance of the annotators is measured by a booking web application. All three measures are later compared and related to each other.
Anthology ID:
L10-1266
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/388_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Marie Mikulová and Jan Štěpánek. 2010. Ways of Evaluation of the Annotators in Building the Prague Czech-English Dependency Treebank. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Ways of Evaluation of the Annotators in Building the Prague Czech-English Dependency Treebank (Mikulová & Štěpánek, LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/388_Paper.pdf