Vebjørn Isaksen


Using Transfer-based Language Models to Detect Hateful and Offensive Language Online
Vebjørn Isaksen | Björn Gambäck
Proceedings of the Fourth Workshop on Online Abuse and Harms

Distinguishing hate speech from non-hate offensive language is challenging, as hate speech does not always include offensive slurs and offensive language does not always express hate. Here, four deep learners based on the Bidirectional Encoder Representations from Transformers (BERT), with either general or domain-specific language models, were tested against two datasets containing tweets labelled as ‘Hateful’, ‘Normal’ or ‘Offensive’. The results indicate that the attention-based models profoundly confuse hate speech with offensive and normal language. However, the pre-trained models outperform state-of-the-art results in terms of accurately predicting the hateful instances.
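
The kind of confusion between classes described above is usually read off a confusion matrix and per-class recall. The following is a minimal sketch of that bookkeeping for the three labels, not the authors' evaluation code; the function names and the toy label lists are hypothetical and are not data from the paper.

```python
from collections import Counter

# The three labels named in the abstract.
LABELS = ("Hateful", "Normal", "Offensive")


def confusion_matrix(gold, predicted, labels=LABELS):
    """Return counts[gold_label][predicted_label] for every label pair."""
    pairs = Counter(zip(gold, predicted))
    return {g: {p: pairs[(g, p)] for p in labels} for g in labels}


def per_class_recall(matrix):
    """Fraction of each gold class that was predicted as that class."""
    recall = {}
    for g, row in matrix.items():
        total = sum(row.values())
        recall[g] = row[g] / total if total else 0.0
    return recall


# Made-up toy labels, for illustration only (NOT results from the paper).
gold = ["Hateful", "Hateful", "Hateful", "Hateful",
        "Offensive", "Offensive", "Normal", "Normal"]
pred = ["Hateful", "Offensive", "Offensive", "Normal",
        "Offensive", "Offensive", "Normal", "Normal"]

matrix = confusion_matrix(gold, pred)
recall = per_class_recall(matrix)

for label in LABELS:
    print(label, matrix[label], f"recall={recall[label]:.2f}")
```

In this toy example, the row for 'Hateful' shows how many hateful tweets were assigned to 'Offensive' or 'Normal' instead, which is the type of error the abstract refers to.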