Skadi Dinter
2023
Multilingual Racial Hate Speech Detection Using Transfer Learning
Abinew Ali Ayele | Skadi Dinter | Seid Muhie Yimam | Chris Biemann
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
The rise of social media eases the spread of hateful content, especially racist content, which can have severe consequences. In this paper, we analyze tweets about the death of George Floyd in May 2020, an event that accelerated debates on racism globally. We focus on tweets published in French during the month following Floyd's death. Using the Yandex Toloka platform, we annotate each tweet as hate, offensive, or normal. Tweets that are offensive or hateful are further annotated as racial or non-racial. We build French hate speech detection models based on multilingual BERT and CamemBERT, and apply transfer learning by fine-tuning the HateXplain model. We compare different approaches to resolving annotation ties and find that the detection model based on CamemBERT yields the best results in our experiments.