Annotation Process Management Revisited

Dain Kaplan, Ryu Iida, Takenobu Tokunaga


Abstract
Proper annotation process management is crucial to the construction of corpora, which are in turn indispensable to the data-driven techniques that have come to the forefront in NLP during the last two decades. It is still common to see ad-hoc tools created for a specific annotation project, but it is time this changed; creating such tools is costly in labor and time, and is secondary to corpus creation. In addition, such tools likely lack proper annotation process management, which becomes increasingly important as corpora grow in size and complexity. This paper first raises a list of ten needs that any general-purpose annotation system should address, such as user & role management, delegation & monitoring of work, diffing & merging annotators’ work, versioning of corpora, multilingual support, and import/export format flexibility. A framework to address these needs is then proposed, and the benefits of proper annotation process management for the creation and maintenance of corpora are explained. The paper then introduces SLATE (Segment and Link-based Annotation Tool Enhanced), the second iteration of a web-based annotation tool, which is being rewritten to implement the proposed framework.
Anthology ID:
L10-1080
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/129_Paper.pdf
Cite (ACL):
Dain Kaplan, Ryu Iida, and Takenobu Tokunaga. 2010. Annotation Process Management Revisited. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Annotation Process Management Revisited (Kaplan et al., LREC 2010)
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/129_Paper.pdf