Wiebke Petersen

2025

Transformer25 at SemEval-2025 Task 1: A similarity-based approach
Wiebke Petersen | Lara Eulenpesch | Ann Piho | Julio Julio | Victoria Lohner
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Accurately representing non-compositional language, such as idiomatic expressions, is essential to avoid misinterpretations that could affect subsequent tasks. This paper presents the submission of Transformer25 to the SemEval 2025 task on advancing the representation of multimodal idiomaticity. This challenge involves matching idiomatic expressions with corresponding image descriptions that depict their meanings. Our system utilizes a BERT-based pre-trained sentence embedding model, ChatGPT-generated definitions, and preprocessing. Our final submission ranked 7th out of 9 for Subtask A. The paper provides a system description and analysis of our model, including minimal visualizations.

2024

KlarTextCoders at StaGE: Automatic Statement Annotations for German Easy Language
Akhilesh Kakolu Ramarao | Wiebke Petersen | Anna Sophia Stein | Emma Stein | Hanxin Xia
Proceedings of GermEval 2024 Shared Task on Statement Segmentation in German Easy Language (StaGE)

Team art-nat-HHU at SemEval-2024 Task 8: Stylistically Informed Fusion Model for MGT-Detection
Vittorio Ciccarelli | Cornelia Genz | Nele Mastracchio | Wiebke Petersen | Anna Stein | Hanxin Xia
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper presents our solution for subtask A of shared task 8 of SemEval 2024 for classifying human- and machine-written texts in English across multiple domains. We propose a fusion model consisting of a RoBERTa-based pre-classifier and two MLPs that have been trained to correct the pre-classifier using linguistic features. Our model achieved an accuracy of 85%.

2023

hhuEDOS at SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS) Binary Sexism Detection (Subtask A)
Wiebke Petersen | Diem-Ly Tran | Marion Wroblewitz
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

In this paper, we describe SemEval-2023 Task 10, a shared task on detecting and predicting sexist language. The dataset consists of labeled sexist and non-sexist data targeted towards women, acquired from both Reddit and Gab. We present and compare several approaches we experimented with, along with our final submitted model. We also provide an error analysis that identifies the challenges we encountered in the process. A total of 84 teams participated. Our model ranks 55th overall in Subtask A of the shared task.

2022

In this paper, we describe our submission to the ‘Text Complexity DE Challenge 2022’ shared task on predicting the complexity of German sentences. We compare the performance of different feature-based regression architectures and transformer language models. Our best candidate is a fine-tuned German DistilBERT model that ignores linguistic features of the sentences. Our model ranks 7th in the shared task.

The paper presents an iterative bidirectional clustering of adjectives and nouns based on a co-occurrence matrix. The clustering method combines a Vector Space Model (VSM) with a Latent Dirichlet Allocation (LDA), and the results of the two are merged in each iterative step. The aim is to derive a clustering of German adjectives that reflects latent semantic classes of adjectives and that can be used to induce frame-based representations of nouns in a later step. We are able to show that the method induces meaningful groups of adjectives and that it outperforms a baseline k-means algorithm.