Abstract
This paper describes a featurized functional dependency corpus automatically derived from the Penn Treebank. Each word in the corpus is associated with over three dozen features describing the functional syntactic structure of a sentence as well as some shallow morphology. The corpus was created for use in probabilistic surface generation, but could also be useful as a resource for the study of English and the development of other NLP applications.
- Anthology ID:
- L06-1256
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/435_pdf.pdf
- Cite (ACL):
- Irene Langkilde-Geary and Justin Betteridge. 2006. A Factored Functional Dependency Transformation of the English Penn Treebank for Probabilistic Surface Generation. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- A Factored Functional Dependency Transformation of the English Penn Treebank for Probabilistic Surface Generation (Langkilde-Geary & Betteridge, LREC 2006)