Towards Explainable Hate Speech Detection

Happy Khairunnisa Sariyanto; Diclehan Ulucan; Oguzhan Ulucan; Marc Ebner

doi:10.18653/v1/2025.findings-acl.667

Abstract

Recent advancements in deep learning have significantly enhanced the efficiency and accuracy of natural language processing (NLP) tasks. However, these models often require substantial computational resources, which remains a major drawback. Reducing the complexity of deep learning architectures and exploring simpler yet effective approaches can lead to cost-efficient NLP solutions. This is also a step towards explainable AI, i.e., uncovering how a particular task is carried out. For this analysis, we chose the task of hate speech detection. We address it by introducing a model that classifies text using a weighted sum of valence, arousal, and dominance (VAD) scores. To determine the optimal weights and classification strategies, we analyze hate speech and non-hate speech words based on both their individual and summed VAD values. Our experimental results demonstrate that this straightforward approach can compete with state-of-the-art neural network methods, including GPT-based models, in detecting hate speech.
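To make the idea of classifying with a weighted sum of VAD scores concrete, the following is a minimal sketch and not the authors' implementation. The toy lexicon values, the weights, the threshold, the neutral fallback for unknown words, the averaging over words, and the "below threshold means flagged" rule are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
from typing import Dict, Tuple

VAD = Tuple[float, float, float]  # (valence, arousal, dominance)

# Hypothetical toy lexicon with made-up placeholder values; a real system
# would load scores from an actual VAD lexicon.
TOY_LEXICON: Dict[str, VAD] = {
    "awful": (0.1, 0.8, 0.4),
    "people": (0.6, 0.4, 0.5),
    "lovely": (0.9, 0.5, 0.6),
}

# Assumption: words missing from the lexicon are treated as neutral.
NEUTRAL: VAD = (0.5, 0.5, 0.5)

# Hypothetical weights and threshold, not values reported in the paper.
WEIGHTS: VAD = (1.0, -1.0, 0.0)
THRESHOLD = 0.0


def weighted_vad_score(text: str,
                       lexicon: Dict[str, VAD] = TOY_LEXICON,
                       weights: VAD = WEIGHTS) -> float:
    """Average over words of w_v * valence + w_a * arousal + w_d * dominance."""
    words = text.lower().split()
    if not words:
        return 0.0
    w_v, w_a, w_d = weights
    total = 0.0
    for word in words:
        v, a, d = lexicon.get(word, NEUTRAL)
        total += w_v * v + w_a * a + w_d * d
    # Assumption: average rather than raw sum, so text length does not
    # dominate the score.
    return total / len(words)


def is_flagged(text: str, threshold: float = THRESHOLD) -> bool:
    """Assumed decision rule: flag the text when the score is below the threshold."""
    return weighted_vad_score(text) < threshold


if __name__ == "__main__":
    for sample in ("awful people", "lovely people"):
        print(sample, round(weighted_vad_score(sample), 3), is_flagged(sample))
```

Because the decision reduces to three weights and one threshold, each prediction can be traced back to the per-word scores that produced it, which is the sense in which such a model is easier to inspect than a neural network.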

Anthology ID: 2025.findings-acl.667
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 12883–12893
URL: https://preview.aclanthology.org/transition-to-people-yaml/2025.findings-acl.667/
DOI: 10.18653/v1/2025.findings-acl.667
Cite (ACL): Happy Khairunnisa Sariyanto, Diclehan Ulucan, Oguzhan Ulucan, and Marc Ebner. 2025. Towards Explainable Hate Speech Detection. In Findings of the Association for Computational Linguistics: ACL 2025, pages 12883–12893, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Towards Explainable Hate Speech Detection (Sariyanto et al., Findings 2025)
PDF: https://preview.aclanthology.org/transition-to-people-yaml/2025.findings-acl.667.pdf
