A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge
Lilian D. A. Wanzare, Alessandra Zarcone, Stefan Thater, Manfred Pinkal
Abstract
Scripts are standardized event sequences describing typical everyday activities, which play an important role in the computational modeling of cognitive abilities (in particular for natural language processing). We present a large-scale crowdsourced collection of explicit linguistic descriptions of script-specific event sequences (40 scenarios with 100 sequences each). The corpus is enriched with crowdsourced alignment annotation on a subset of the event descriptions, to be used in future work as seed data for automatic alignment of event descriptions (for example via clustering). The event descriptions to be aligned were chosen among those expected to have the strongest corrective effect on the clustering algorithm. The alignment annotation was evaluated against a gold standard of expert annotators. The resulting database of partially-aligned script-event descriptions provides a sound empirical basis for inducing high-quality script knowledge, as well as for any task involving alignment and paraphrase detection of events.- Anthology ID:
- L16-1556
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- 3494–3501
- Language:
- URL:
- https://aclanthology.org/L16-1556
- DOI:
- Cite (ACL):
- Lilian D. A. Wanzare, Alessandra Zarcone, Stefan Thater, and Manfred Pinkal. 2016. A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3494–3501, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge (Wanzare et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/L16-1556.pdf