LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation

Junyeong Park, Seogyeong Jeong, Seyoung Song, Yohan Lee, Alice Oh


Abstract
Content moderation platforms concentrate resources on English content despite serving predominantly non-English-speaking users. Also, given the scarcity of native moderators for low-resource languages, non-native moderators must bridge this gap in moderation tasks such as hate speech moderation. Through a user study, we identify that non-native moderators struggle with understanding culturally-specific knowledge, sentiment, and internet culture in hate speech. To assist non-native moderators, we present LLM-C3MOD, a human-LLM collaborative pipeline with three steps: (1) RAG-enhanced cultural context annotations; (2) initial LLM-based moderation; and (3) targeted human moderation for cases lacking LLM consensus. Evaluated on a Korean hate speech dataset with Indonesian and German participants, our system achieves 78% accuracy (surpassing GPT-4o’s 71% baseline) while reducing human workload by 83.6%. In addition, cultural context annotations improved non-native moderator accuracy from 22% to 61%, with humans notably excelling at nuanced tasks where LLMs struggle. Our findings demonstrate that non-native moderators, when properly supported by LLMs, can effectively contribute to cross-cultural hate speech moderation.
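The three-step routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `moderate`, the callables `annotate`, `llm_moderators`, and `human_moderator`, the label strings, and the reading of "consensus" as unanimous agreement among the LLM moderators are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Label = str  # e.g. "hate" or "not_hate"; label names are illustrative


def moderate(
    comment: str,
    annotate: Callable[[str], str],
    llm_moderators: Sequence[Callable[[str, str], Label]],
    human_moderator: Callable[[str, str], Label],
) -> Tuple[Label, str]:
    """Return (label, decided_by) for one comment.

    All callables are hypothetical stand-ins supplied by the caller.
    """
    # Step 1: produce a cultural context annotation for the comment
    # (in the paper this step is RAG-enhanced; here it is an opaque callable).
    context = annotate(comment)

    # Step 2: collect an initial decision from each LLM moderator.
    votes = [m(comment, context) for m in llm_moderators]

    # Step 3: accept the LLM decision only when every vote agrees
    # (assumed definition of consensus); otherwise escalate to a human.
    if votes and len(set(votes)) == 1:
        return votes[0], "llm"
    return human_moderator(comment, context), "human"
```

Under this sketch, only comments on which the LLM moderators disagree reach the human moderator, which is the mechanism the abstract credits for the reduced human workload.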
Anthology ID:
2025.c3nlp-1.7
Volume:
Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP 2025)
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Vinodkumar Prabhakaran, Sunipa Dev, Luciana Benotti, Daniel Hershcovich, Yong Cao, Li Zhou, Laura Cabello, Ife Adebara
Venues:
C3NLP | WS
Publisher:
Association for Computational Linguistics
Pages:
71–88
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.c3nlp-1.7/
Cite (ACL):
Junyeong Park, Seogyeong Jeong, Seyoung Song, Yohan Lee, and Alice Oh. 2025. LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation. In Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP 2025), pages 71–88, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation (Park et al., C3NLP 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.c3nlp-1.7.pdf