Abstract
In our work, we present an analysis of the TimeBank corpus---the only available reference sample of TimeML-compliant annotation---from the point of view of its utility as a training resource for developing automated TimeML annotators. We are encouraged by experimental results indicative of the potential of TimeBank; at the same time, closer inspection of causes for some systematic errors shows off certain deficiencies in the corpus, primarily to do with small size and inconsistent annotation. Our analysis suggests that even a reference resource, developed outside of a rigorous process of training corpus design and creation, can be extremely valuable for training and development purposes. The analysis also highlights areas of correction and improvement for evolving the current reference corpus into a community infrastructure resource.- Anthology ID:
- L06-1202
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/346_pdf.pdf
- DOI:
- Cite (ACL):
- Branimir Boguraev and Rie Kubota Ando. 2006. Analysis of TimeBank as a Resource for TimeML Parsing. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- Analysis of TimeBank as a Resource for TimeML Parsing (Boguraev & Ando, LREC 2006)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/346_pdf.pdf