Onur Uca


2023

RECESS: Resource for Extracting Cause, Effect, and Signal Spans
Fiona Anting Tan | Hansi Hettiarachchi | Ali Hürriyetoğlu | Nelleke Oostdijk | Tommaso Caselli | Tadashi Nomoto | Onur Uca | Farhana Ferdousi Liza | See-Kiong Ng
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

On the Road to a Protest Event Ontology for Bulgarian: Conceptual Structures and Representation Design
Milena Slavcheva | Hristo Tanev | Onur Uca
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text

The paper presents a semantic model of protest events, called Semantic Interpretations of Protest Events (SemInPE). The analytical framework used for building the semantic representations is inspired by the object-oriented paradigm in computer science and by a cognitive approach to linguistic analysis. The model is a practical application of the Unified Eventity Representation (UER) formalism, which is based on the Unified Modeling Language (UML). The multi-layered architecture of the model provides flexible means for building the semantic representations of language objects along a scale of generality and specificity. It is therefore a suitable environment for creating the elements of ontologies on various topics and for different languages.

Event Causality Identification - Shared Task 3, CASE 2023
Fiona Anting Tan | Hansi Hettiarachchi | Ali Hürriyetoğlu | Nelleke Oostdijk | Onur Uca | Surendrabikram Thapa | Farhana Ferdousi Liza
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text

The Event Causality Identification Shared Task of CASE 2023 is the second iteration of a shared task centered around the Causal News Corpus. Two subtasks were involved: In Subtask 1, participants were challenged to predict whether a sentence contains a causal relation. In Subtask 2, participants were challenged to identify the Cause, Effect, and Signal spans given an input causal sentence. For both subtasks, participants uploaded their predictions for a held-out test set, and ranking was based on binary F1 for Subtask 1 and macro F1 for Subtask 2. This paper includes an overview of the work of the ten teams that submitted their results to our competition and the six system description papers that were received. The highest F1 scores achieved for Subtasks 1 and 2 were 84.66% and 72.79%, respectively.

Detecting and Geocoding Battle Events from Social Media Messages on the Russo-Ukrainian War: Shared Task 2, CASE 2023
Hristo Tanev | Nicolas Stefanovitch | Andrew Halterman | Onur Uca | Vanni Zavarella | Ali Hurriyetoglu | Bertrand De Longueville | Leonida Della Rocca
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text

The purpose of Shared Task 2 at the Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) 2023 workshop was to test the ability of the participating models and systems to detect and geocode armed conflict events in social media messages from Telegram channels reporting on the Russo-Ukrainian war. The evaluation followed an approach introduced in CASE 2021 (Giorgi et al., 2021): for each system, we consider the correlation between the spatio-temporal distribution of its detected events and that of the events identified for the same period in the ACLED (Armed Conflict Location and Event Data Project) database (Raleigh et al., 2010). We use ACLED as the ground truth, since it is a well-established standard in the field of event extraction and political trend analysis, which relies on human annotators for the encoding of security events using a fine-grained taxonomy. Two systems participated in this shared task; in this paper we report on both the shared task and the participating systems.
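
The correlation-based evaluation described in this abstract can be illustrated with a minimal sketch. It is not the organizers' scoring code: the grid cell size, the use of Pearson correlation, the restriction to bins where at least one side has an event, and all function names (`bin_events`, `pearson`, `distribution_correlation`) are assumptions made here for illustration. Events are counted per (day, grid cell) for a system and for the reference database, and the two count vectors are then correlated.

```python
from collections import Counter
from math import sqrt


def bin_events(events, cell_size=1.0):
    """Count events per (day, grid cell). `events` holds (day, lat, lon) tuples.

    The square grid in degrees is an assumption, not the official binning.
    """
    counts = Counter()
    for day, lat, lon in events:
        cell = (int(lat // cell_size), int(lon // cell_size))
        counts[(day, cell)] += 1
    return counts


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def distribution_correlation(system_events, reference_events, cell_size=1.0):
    """Correlate system and reference counts over every bin seen on either side."""
    sys_counts = bin_events(system_events, cell_size)
    ref_counts = bin_events(reference_events, cell_size)
    bins = sorted(set(sys_counts) | set(ref_counts))
    # Counter returns 0 for bins that one side never observed.
    return pearson([sys_counts[b] for b in bins], [ref_counts[b] for b in bins])


# Hypothetical toy data: (day, latitude, longitude)
reference = [("2023-01-01", 50.4, 30.5), ("2023-01-01", 50.4, 30.5), ("2023-01-02", 48.0, 37.8)]
system = list(reference)
score = distribution_correlation(system, reference)
```

With identical event lists, as in the toy data, the score is 1.0; a system that places events in the wrong cells or on the wrong days would score lower.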

2022

Event Causality Identification with Causal News Corpus - Shared Task 3, CASE 2022
Fiona Anting Tan | Hansi Hettiarachchi | Ali Hürriyetoğlu | Tommaso Caselli | Onur Uca | Farhana Ferdousi Liza | Nelleke Oostdijk
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)

The Event Causality Identification Shared Task of CASE 2022 involved two subtasks working on the Causal News Corpus. Subtask 1 required participants to predict whether a sentence contains a causal relation. This is a supervised binary classification task. Subtask 2 required participants to identify the Cause, Effect, and Signal spans per causal sentence. This could be seen as a supervised sequence labeling task. For both subtasks, participants uploaded their predictions for a held-out test set, and ranking was based on binary F1 for Subtask 1 and macro F1 for Subtask 2. This paper summarizes the work of the 17 teams that submitted their results to our competition and the 12 system description papers that were received. The best F1 scores achieved for Subtasks 1 and 2 were 86.19% and 54.15%, respectively. All the top-performing approaches involved pre-trained language models fine-tuned to the targeted task. We further discuss these approaches and analyze errors across participants’ systems in this paper.
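
The difference between the two ranking metrics named in this abstract can be shown with a small sketch. It covers only the general definitions of binary F1 and macro F1 on label sequences; it is not the shared task's official scorer, the span-level details of the Subtask 2 scoring are not reproduced, and the function names and toy labels are hypothetical.

```python
def f1_for_label(gold, pred, label):
    """F1 score treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def binary_f1(gold, pred, positive=1):
    """Binary F1: only the positive (here: causal) class is scored."""
    return f1_for_label(gold, pred, positive)


def macro_f1(gold, pred):
    """Macro F1: unweighted mean of the per-label F1 scores."""
    labels = sorted(set(gold) | set(pred))
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


# Hypothetical toy labels: 1 = causal sentence, 0 = non-causal sentence
gold = [1, 1, 1, 0]
pred = [1, 1, 0, 0]
binary_score = binary_f1(gold, pred)  # F1 of class 1 only
macro_score = macro_f1(gold, pred)    # mean of F1 for class 0 and class 1
```

On the toy labels the binary F1 is 0.8, while the macro F1 is lower (about 0.73) because the non-causal class, which has a false positive, is averaged in with equal weight.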

Extended Multilingual Protest News Detection - Shared Task 1, CASE 2021 and 2022
Ali Hürriyetoğlu | Osman Mutlu | Fırat Duruşan | Onur Uca | Alaeddin Gürel | Benjamin J. Radford | Yaoyao Dai | Hansi Hettiarachchi | Niklas Stoehr | Tadashi Nomoto | Milena Slavcheva | Francielle Vargas | Aaqib Javid | Fatih Beyhan | Erdem Yörük
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)

We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task is a continuation of CASE 2021 and consists of four subtasks: i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension expands the test data with more data in previously available languages, namely English, Hindi, Portuguese, and Spanish, and adds new test data in Mandarin, Turkish, and Urdu for Subtask 1, document classification. The training data from CASE 2021 in English, Portuguese, and Spanish were utilized. Therefore, predicting document labels in Hindi, Mandarin, Turkish, and Urdu occurs in a zero-shot setting. The CASE 2022 workshop also accepts reports on systems developed for predicting test data of CASE 2021. We observe that the best systems submitted by CASE 2022 participants achieve between 79.71 and 84.06 F1-macro for new languages in a zero-shot setting. The winning approaches mainly involve ensembling models and merging data in multiple languages. The best two submissions on CASE 2021 data outperform submissions from last year for Subtask 1 and Subtask 2 in all languages. Only the following scenarios were not outperformed by new submissions on CASE 2021: Subtask 3 Portuguese and Subtask 4 English.

The Causal News Corpus: Annotating Causal Relations in Event Sentences from News
Fiona Anting Tan | Ali Hürriyetoğlu | Tommaso Caselli | Nelleke Oostdijk | Tadashi Nomoto | Hansi Hettiarachchi | Iqra Ameer | Onur Uca | Farhana Ferdousi Liza | Tiancheng Hu
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Despite the importance of understanding causality, corpora addressing causal relations are limited. There is a discrepancy between existing annotation guidelines of event causality and conventional causality corpora that focus more on linguistics. Many guidelines restrict themselves to include only explicit relations or clause-based arguments. Therefore, we propose an annotation schema for event causality that addresses these concerns. We annotated 3,559 event sentences from protest event news with labels on whether they contain causal relations or not. Our corpus is known as the Causal News Corpus (CNC). A neural network built upon a state-of-the-art pre-trained language model performed well, with an 81.20% F1 score on the test set and 83.46% in 5-fold cross-validation. CNC is transferable across two external corpora: CausalTimeBank (CTB) and Penn Discourse Treebank (PDTB). Leveraging each of these external datasets for training, we achieved up to approximately 64% F1 on the CNC test set without additional fine-tuning. CNC also served as an effective training and pre-training dataset for the two external corpora. Lastly, we demonstrate the difficulty of our task to the layman in a crowd-sourced annotation exercise. Our annotated corpus is publicly available, providing a valuable resource for causal text mining researchers.