Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words

Joan Serrà; Ilias Leontiadis; Dimitris Spathis; Gianluca Stringhini; Jeremy Blackburn; Athena Vakali

doi:10.18653/v1/W17-3005

Abstract

Common approaches to text categorization essentially rely either on n-gram counts or on word embeddings. This presents important difficulties in highly dynamic or quickly interacting environments, where the appearance of new words and/or varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in such fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. In particular, we train a next-character prediction model for each class and then exploit the error of these class-based models to inform a neural network classifier. This way, we shift from the ‘ability to describe’ seen documents to the ‘ability to predict’ unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.
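The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions: an add-one smoothed character n-gram model stands in for the trained next-character prediction model, picking the class with the lowest error stands in for the neural network classifier, and the class names, helper names (`CharNgramLM`, `error_features`, `predict`), and toy texts are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class CharNgramLM:
    """Add-one smoothed character n-gram model.

    Assumption: a simple stand-in for a trained next-character predictor.
    """

    def __init__(self, order=3):
        self.order = order
        self.context_counts = defaultdict(Counter)
        self.vocab = set()

    def _pairs(self, text):
        # Pad with start markers and append an end marker so every
        # character, including the first, has a full-length context.
        padded = "\x02" * self.order + text + "\x03"
        for i in range(self.order, len(padded)):
            yield padded[i - self.order:i], padded[i]

    def fit(self, texts):
        for text in texts:
            for context, char in self._pairs(text):
                self.context_counts[context][char] += 1
                self.vocab.add(char)
        return self

    def error(self, text, vocab_size):
        """Mean negative log-probability of each next character."""
        total, n = 0.0, 0
        for context, char in self._pairs(text):
            counts = self.context_counts.get(context, Counter())
            prob = (counts[char] + 1) / (sum(counts.values()) + vocab_size)
            total -= math.log(prob)
            n += 1
        return total / n


def error_features(models, text):
    """One prediction error per class; these would feed a classifier."""
    # Shared vocabulary size (+1 for unseen characters) keeps the
    # errors of different class models comparable.
    vocab_size = len(set().union(*(m.vocab for m in models.values()))) + 1
    return {label: m.error(text, vocab_size) for label, m in models.items()}


def predict(models, text):
    """Assumption: lowest error wins, in place of a neural classifier."""
    feats = error_features(models, text)
    return min(feats, key=feats.get)


# Hypothetical toy data, one language model per class.
train = {
    "abusive": ["you are an idiot", "shut up you moron", "what a stupid idiot"],
    "neutral": ["have a nice day", "see you at the park", "what a lovely morning"],
}
models = {label: CharNgramLM(order=3).fit(texts) for label, texts in train.items()}

# "idiooot" never appears in training, yet it still shares character
# contexts with the abusive class, so that model predicts it with less error.
print(error_features(models, "idiooot"))
print(predict(models, "idiooot"))
```

Because the features are per-class prediction errors rather than counts of known tokens, an out-of-vocabulary misspelling can still be scored; in a fuller setup the error vector would be passed to a trained classifier instead of taking the minimum.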

Anthology ID: W17-3005
Volume: Proceedings of the First Workshop on Abusive Language Online
Month: August
Year: 2017
Address: Vancouver, BC, Canada
Editors: Zeerak Waseem, Wendy Hui Kyong Chung, Dirk Hovy, Joel Tetreault
Venue: ALW
Publisher: Association for Computational Linguistics
Pages: 36–40
URL: https://aclanthology.org/W17-3005
DOI: 10.18653/v1/W17-3005
Cite (ACL): Joan Serrà, Ilias Leontiadis, Dimitris Spathis, Gianluca Stringhini, Jeremy Blackburn, and Athena Vakali. 2017. Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words. In Proceedings of the First Workshop on Abusive Language Online, pages 36–40, Vancouver, BC, Canada. Association for Computational Linguistics.
Cite (Informal): Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words (Serrà et al., ALW 2017)
PDF: https://preview.aclanthology.org/proper-vol2-ingestion/W17-3005.pdf
