Theresa Krumbiegel


2022

NLP4ITF @ Causal News Corpus 2022: Leveraging Linguistic Information for Event Causality Classification
Theresa Krumbiegel | Sophie Decher
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)

We present our submission to Subtask 1 of the CASE-2022 Shared Task 3: Event Causality Identification with Causal News Corpus as part of the 5th Workshop on Challenges and Applications of Automated Extraction of Sociopolitical Events from Text (CASE 2022) (Tan et al., 2022a). The task focuses on causal event classification on the sentence level and involves differentiating between sentences that include a cause-effect relation and sentences that do not. We approached this as a binary text classification task and experimented with multiple training sets augmented with additional linguistic information. Our best model was generated by training roberta-base on a combination of data from both Subtasks 1 and 2 with the addition of named entity annotations. During the development phase we achieved a macro F1 of 0.8641 with this model on the development set provided by the task organizers. When testing the model on the final test data, we achieved a macro F1 of 0.8516.

2021

FKIE_itf_2021 at CASE 2021 Task 1: Using Small Densely Fully Connected Neural Nets for Event Detection and Clustering
Nils Becker | Theresa Krumbiegel
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

In this paper we present multiple approaches for event detection on document and sentence level, as well as a technique for event sentence co-reference resolution. The advantage of our co-reference resolution approach, which handles the task as a clustering problem, is that we use a single neural net to solve the task, in contrast to other clustering algorithms that are often built on more complex models. This means that we can focus on optimizing a single neural network instead of having to optimize numerous different parameters. We use small densely connected neural networks and pre-trained multilingual transformer embeddings in all subtasks. We use either document or sentence embeddings, depending on the task, and refrain from using word embeddings, so that the implementation of complicated network structures and unfolding of RNNs, which can deal with input of different sizes, is not necessary. We achieved an average macro F1 of 0.65 in subtask 1 (i.e., document level classification), and a macro F1 of 0.70 in subtask 2 (i.e., sentence level classification). For the co-reference resolution subtask, we achieved an average CoNLL-2012 score across all languages of 0.83.

CASE 2021 Task 2 Socio-political Fine-grained Event Classification using Fine-tuned RoBERTa Document Embeddings
Samantha Kent | Theresa Krumbiegel
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

We present our submission to Task 2 of the Socio-political and Crisis Events Detection Shared Task at the CASE @ ACL-IJCNLP 2021 workshop. The task at hand aims at the fine-grained classification of socio-political events. Our best model was a fine-tuned RoBERTa transformer model using document embeddings. The corpus consisted of a balanced selection of sub-events extracted from the ACLED event dataset. We achieved a macro F-score of 0.923 and a micro F-score of 0.932 during our preliminary experiments on a held-out test set. The same model also performed best on the shared task test data (weighted F-score = 0.83). To analyze the results we calculated the topic compactness of the commonly misclassified events and conducted an error analysis.
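The abstract above reports macro, micro and weighted F-scores, which can differ noticeably on the same predictions. As an illustration only, the following minimal sketch shows how the three averages are commonly computed for single-label classification; the function name `f1_scores` and the toy labels are assumptions made for this example and are not taken from the paper or the shared task's evaluation scripts.

```python
from collections import Counter


def f1_scores(gold, pred):
    """Return macro, micro and support-weighted F1 for single-label predictions.

    Hypothetical helper for illustration; not the shared task's scorer.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed

    def f1(t, f_pos, f_neg):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    support = Counter(gold)

    # Macro: unweighted mean over classes, so rare classes count as much as frequent ones
    macro = sum(per_class.values()) / len(labels)
    # Micro: pool the counts over all classes before computing F1
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Weighted: per-class F1 weighted by the number of gold instances of each class
    weighted = sum(per_class[c] * support[c] for c in labels) / len(gold)
    return {"macro": macro, "micro": micro, "weighted": weighted}


if __name__ == "__main__":
    gold = ["protest", "protest", "protest", "riot"]
    pred = ["protest", "protest", "riot", "riot"]
    print(f1_scores(gold, pred))
```

With one label per instance, the micro average equals accuracy, whereas the macro average gives each class equal weight regardless of its frequency.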

2020

Information Space Dashboard
Theresa Krumbiegel | Albert Pritzkau | Hans-Christian Schmitz
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management

The information space, where information is generated, stored, exchanged and discussed, is not idyllic but a space where campaigns of disinformation and destabilization are conducted. Such campaigns are subsumed under the terms “hybrid warfare” and “information warfare” (Woolley and Howard, 2017). In order to enable awareness of them, we propose an information state dashboard comprising various components/apps for data collection, analysis and visualization. The aim of the dashboard is to support an analyst in generating a common operational picture of the information space, link it with an operational picture of the physical space and, thus, contribute to overarching situational awareness. The dashboard is work in progress. However, a first prototype with components for exploiting elementary language statistics, keyword and metadata analysis, text classification and network analysis has been implemented. Further components, in particular, for event extraction and sentiment analysis are under development. As a demonstration case, we briefly discuss the analysis of historical data regarding violent anti-migrant protests and respective counter-protests that took place in Chemnitz in 2018.