Abstract
Distinguishing hate speech from non-hate offensive language is challenging, as hate speech does not always include offensive slurs and offensive language does not always express hate. Here, four deep learners based on the Bidirectional Encoder Representations from Transformers (BERT), with either general or domain-specific language models, were tested against two datasets containing tweets labelled as either ‘Hateful’, ‘Normal’ or ‘Offensive’. The results indicate that the attention-based models profoundly confuse hate speech with offensive and normal language. However, the pre-trained models outperform state-of-the-art results in terms of accurately predicting the hateful instances.
- Anthology ID:
- 2020.alw-1.3
- Volume:
- Proceedings of the Fourth Workshop on Online Abuse and Harms
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Venue:
- ALW
- Publisher:
- Association for Computational Linguistics
- Pages:
- 16–27
- URL:
- https://aclanthology.org/2020.alw-1.3
- DOI:
- 10.18653/v1/2020.alw-1.3
- Cite (ACL):
- Vebjørn Isaksen and Björn Gambäck. 2020. Using Transfer-based Language Models to Detect Hateful and Offensive Language Online. In Proceedings of the Fourth Workshop on Online Abuse and Harms, pages 16–27, Online. Association for Computational Linguistics.
- Cite (Informal):
- Using Transfer-based Language Models to Detect Hateful and Offensive Language Online (Isaksen & Gambäck, ALW 2020)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2020.alw-1.3.pdf
- Data
- BookCorpus, Hate Speech
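The confusion between the ‘Hateful’ and ‘Offensive’ classes reported in the abstract is typically diagnosed with per-class recall and a confusion matrix over the three labels. A minimal sketch of that analysis, using fabricated example labels (not data or results from the paper):

```python
from collections import Counter

LABELS = ("Hateful", "Offensive", "Normal")

def per_class_recall(gold, pred):
    """Recall per label: of all gold instances of a class,
    the fraction predicted as that class."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    return {label: hits[label] / totals[label]
            for label in LABELS if totals[label]}

def confusion(gold, pred):
    """Counts of (gold label, predicted label) pairs."""
    return Counter(zip(gold, pred))

# Fabricated labels illustrating hateful tweets being
# misclassified as merely offensive:
gold = ["Hateful", "Hateful", "Offensive", "Normal", "Hateful", "Normal"]
pred = ["Offensive", "Hateful", "Offensive", "Normal", "Offensive", "Normal"]

print(per_class_recall(gold, pred))
print(confusion(gold, pred)[("Hateful", "Offensive")])
```

Low recall on ‘Hateful’ combined with a large (‘Hateful’, ‘Offensive’) cell in the confusion matrix is the signature of the class confusion the paper discusses.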