Further Developments in Treebank Error Detection Using Derivation Trees
Abstract
This work describes how derivation tree fragments based on a variant of Tree Adjoining Grammar (TAG) can be used to check treebank consistency. Annotations of word sequences are compared both for their internal structural consistency and for their external relation to the rest of the tree. We expand on earlier work in this area in three ways. First, we provide a more complete description of the system, showing how a naive use of TAG structures will not work, leading to a necessary refinement. We also provide a more complete account of the processing pipeline, including the grouping together of structurally similar errors and the elimination of duplicates. Second, we include the new experimental external relation check to find an additional class of errors. Third, we broaden the evaluation to include both the internal and external relation checks, and evaluate the system on both an Arabic and an English treebank. The evaluation has been successful enough that the internal check has been integrated into the standard pipeline for current English treebank construction at the Linguistic Data Consortium.
- Anthology ID:
- L12-1100
- Volume:
- Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month:
- May
- Year:
- 2012
- Address:
- Istanbul, Turkey
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1840–1847
- URL:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/251_Paper.pdf
- Cite (ACL):
- Seth Kulick, Ann Bies, and Justin Mott. 2012. Further Developments in Treebank Error Detection Using Derivation Trees. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1840–1847, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal):
- Further Developments in Treebank Error Detection Using Derivation Trees (Kulick et al., LREC 2012)