This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
AgataCybulska
Also published as:
Agata Katarzyna Cybulska
Fixing paper assignments
Please select all papers that do not belong to this person.
Indicate below which author they should be assigned to.
In this paper we examine the representativeness of the EventCorefBank (ECB, Bejan and Harabagiu, 2010) with regards to the language population of large-volume streams of news. The ECB corpus is one of the data sets used for evaluation of the task of event coreference resolution. Our analysis shows that the ECB in most cases covers one seminal event per domain, what considerably simplifies event and so language diversity that one comes across in the news. We augmented the corpus with a new corpus component, consisting of 502 texts, describing different instances of event types that were already captured by the 43 topics of the ECB, making it more representative of news articles on the web. The new “ECB+” corpus is available for further research.
In this paper, we report on a study that was performed within the Semantics of History project on how descriptions of historical events are realized in different types of text and what the implications are for modeling the event information. We believe that different historical perspectives of writers correspond in some degree with genre distinction and correlate with variation in language use. To capture differences between event representations in diverse text types and thus to identify relations between historical events, we defined an event model. We observed clear relations between particular parts of event descriptions - actors, time and location modifiers. Texts, written shortly after an event happened, use more specific and uniquely occurring event descriptions than texts describing the same events but written from a longer time perspective. We carried out some statistical corpus research to confirm this hypothesis. The ability to automatically determine relations between historical events and their sub-events over textual data, based on the relations between event participants, time markers and locations, will have important repercussions for the design of historical information retrieval systems.