Author Profiling for Abuse Detection

Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova


Abstract
The rapid growth of social media in recent years has fed into some highly undesirable phenomena such as the proliferation of hateful and offensive language on the Internet. Previous research suggests that such abusive content tends to come from users who share a set of common stereotypes and form communities around them. The current state-of-the-art approaches to abuse detection are oblivious to user and community information and rely entirely on textual (i.e., lexical and semantic) cues. In this paper, we propose a novel approach to this problem that incorporates community-based profiling features of Twitter users. Experimenting with a dataset of 16k tweets, we show that our methods significantly outperform the current state of the art in abuse detection. Further, we conduct a qualitative analysis of model characteristics. We release our code, pre-trained models and all the resources used in the public domain.
Anthology ID:
C18-1093
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
1088–1098
URL:
https://aclanthology.org/C18-1093
Cite (ACL):
Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, and Ekaterina Shutova. 2018. Author Profiling for Abuse Detection. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1088–1098, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Author Profiling for Abuse Detection (Mishra et al., COLING 2018)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/C18-1093.pdf
Code:
pushkarmishra/AuthorProfilingAbuseDetection