Schema Learning Corpus: Data and Annotation Focused on Complex Events

Song Chen, Jennifer Tracey, Ann Bies, Stephanie Strassel


Abstract
The Schema Learning Corpus (SLC) is a new linguistic resource designed to support research into the structure of complex events in multilingual, multimedia data. The SLC incorporates large volumes of background data in English, Spanish and Russian, and defines 100 complex events (CEs) across 12 domains, with CE profiles containing information about the typical steps and substeps of the CE and its expected event categories. Multiple documents are labeled for each CE, with pointers to evidence in the document for each CE step, plus labeled events and relations along with their arguments across a large tag set. The SLC was designed to support development and evaluation of technology capable of understanding and reasoning about complex real-world events in multimedia, multilingual data streams in order to provide users with a deeper understanding of the potential relationships among seemingly disparate events and actors, and to allow users to make better predictions about how future events are likely to unfold. The Schema Learning Corpus will be made available to the research community through publication in the Linguistic Data Consortium catalog.
Anthology ID:
2024.lrec-main.1254
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
14393–14399
URL:
https://aclanthology.org/2024.lrec-main.1254
Cite (ACL):
Song Chen, Jennifer Tracey, Ann Bies, and Stephanie Strassel. 2024. Schema Learning Corpus: Data and Annotation Focused on Complex Events. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 14393–14399, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Schema Learning Corpus: Data and Annotation Focused on Complex Events (Chen et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2024.lrec-main.1254.pdf