Marius Wawerek


2025

A Comprehensive Taxonomy of Bias Mitigation Methods for Hate Speech Detection
Jan Fillies | Marius Wawerek | Adrian Paschke
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)

Algorithmic hate speech detection is widely used today. However, biases within these systems can lead to discrimination. This research presents an overview of bias mitigation strategies in the field of hate speech detection. The identified strategies are grouped into four categories based on their operating principles. A novel taxonomy of bias mitigation methods is proposed. The mitigation strategies are characterized based on their key concepts and analyzed in terms of their application stage and their need for knowledge of protected attributes. Additionally, the paper discusses potential combinations of these strategies. This research shifts the focus from identifying present biases to examining the similarities and differences between mitigation strategies, thereby facilitating the exchange, stacking, and ensembling of these strategies in future research.