Karolina Zaczynska

2023

Toward a Multilingual Connective Database: Aligning German/French Concessive Connectives
Sophia Rauh | Karolina Zaczynska | Peter Bourgonje
Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)

The UNSC-Graph: An Extensible Knowledge Graph for the UNSC Corpus
Stian Rødven-Eide | Karolina Zaczynska | Antonio Pires | Ronny Patz | Manfred Stede
Proceedings of the 3rd Workshop on Computational Linguistics for the Political and Social Sciences

2022

We present an extension of the SynSemClass Event-type Ontology, originally conceived as a bilingual Czech-English resource. We added German entries to the classes representing the concepts of the ontology. Because we had a different starting point than the original work (an unannotated parallel corpus without links to a valency lexicon and, of course, different existing lexical resources), adapting the annotation guidelines, the data model and the tools used for the original version was a challenge. We describe the process and results of working in such a setup. We also show the next steps to adapt the annotation process, data structures, formats and tools necessary to make the addition of a new language in the future smoother and more efficient, and possibly to allow various teams to work on SynSemClass extensions to many languages concurrently. We also present the latest release, which contains the results of adding German and is freely available for download as well as for online access.

2021

Fine-grained Classification of Political Bias in German News: A Data Set and Initial Experiments
Dmitrii Aksenov | Peter Bourgonje | Karolina Zaczynska | Malte Ostendorff | Julian Moreno-Schneider | Georg Rehm
Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)

We present a data set consisting of German news articles labeled for political bias on a five-point scale in a semi-supervised way. While earlier work on hyperpartisan news detection uses binary classification (i.e., hyperpartisan or not) and English data, we argue for a more fine-grained classification, covering the full political spectrum (i.e., far-left, left, centre, right, far-right) and for extending research to German data. Understanding political bias helps in accurately detecting hate speech and online abuse. We experiment with different classification methods for political bias detection. Their comparatively low performance (a macro-F1 of 43 for our best setup, compared to a macro-F1 of 79 for the binary classification task) underlines the need for more (balanced) data annotated in a fine-grained way.

Extraction and Normalization of Vague Time Expressions in German
Ulrike May | Karolina Zaczynska | Julián Moreno-Schneider | Georg Rehm
Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)