Partial Parsing as a Method to Expedite Dependency Annotation of a Hindi Treebank
Mridul Gupta, Vineet Yadav, Samar Husain, Dipti Misra Sharma
Abstract
The paper describes an approach to expedite the process of manual annotation of a Hindi dependency treebank which is currently under development. We propose a way by which consistency among a set of manual annotators could be improved. Furthermore, we show that our setup can also prove useful for evaluating when an inexperienced annotator is ready to start participating in the production of the treebank. We test our approach on sample sets of data obtained from an ongoing work on creation of this treebank. The results asserting our proposal are reported in this paper. We report results from a semi-automated approach of dependency annotation experiment. We find out the rate of agreement between annotators using Cohens Kappa. We also compare results with respect to the total time taken to annotate sample data-sets using a completely manual approach as opposed to a semi-automated approach. It is observed from the results that this semi-automated approach when carried out with experienced and trained human annotators improves the overall quality of treebank annotation and also speeds up the process.- Anthology ID:
- L10-1512
- Volume:
- Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
- Month:
- May
- Year:
- 2010
- Address:
- Valletta, Malta
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/739_Paper.pdf
- DOI:
- Cite (ACL):
- Mridul Gupta, Vineet Yadav, Samar Husain, and Dipti Misra Sharma. 2010. Partial Parsing as a Method to Expedite Dependency Annotation of a Hindi Treebank. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
- Cite (Informal):
- Partial Parsing as a Method to Expedite Dependency Annotation of a Hindi Treebank (Gupta et al., LREC 2010)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/739_Paper.pdf