Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification

Lu Wei; Liangzhi Li; Tong Xiang; Liu Xiao; Noa Garcia

Abstract

The internet has become a hotspot for hate speech (HS), threatening societal harmony and individual well-being. While automatic detection methods perform well in identifying explicit hate speech (ex-HS), they struggle with more subtle forms, such as implicit hate speech (im-HS). We tackle this problem by introducing a new taxonomy for im-HS detection, defining six encoding strategies named *codetypes*. We present two methods for integrating codetypes into im-HS detection: 1) prompting large language models (LLMs) directly to classify sentences based on generated responses, and 2) using LLMs as encoders with codetypes embedded during the encoding process. Experiments show that the use of codetypes improves im-HS detection in both Chinese and English datasets, validating the effectiveness of our approach across different languages.

Anthology ID: 2025.trustnlp-main.9
Volume: Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)
Month: May
Year: 2025
Address: Albuquerque, New Mexico
Editors: Trista Cao, Anubrata Das, Tharindu Kumarage, Yixin Wan, Satyapriya Krishna, Ninareh Mehrabi, Jwala Dhamala, Anil Ramakrishna, Aram Galystan, Anoop Kumar, Rahul Gupta, Kai-Wei Chang
Venues: TrustNLP | WS
Publisher: Association for Computational Linguistics
Pages: 112–126
URL: https://preview.aclanthology.org/fix-sig-urls/2025.trustnlp-main.9/
Cite (ACL): Lu Wei, Liangzhi Li, Tong Xiang, Liu Xiao, and Noa Garcia. 2025. Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification. In Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025), pages 112–126, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification (Wei et al., TrustNLP 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.trustnlp-main.9.pdf
