Detecting Offensive Language in an Open Chatbot Platform

Hyeonho Song, Jisu Hong, Chani Jung, Hyojin Chin, Mingi Shin, Yubin Choi, Junghoi Choi, Meeyoung Cha


Abstract
While detecting offensive language in online spaces remains an important societal issue, there is still a significant gap in existing research and practical datasets specific to chatbots. Furthermore, many of the current efforts by service providers to automatically filter offensive language are vulnerable to users’ deliberate text manipulation tactics, such as misspelling words. In this study, we analyze offensive language patterns in real logs of 6,254,261 chat utterance pairs from the commercial chat service Simsimi, which cover a variety of conversation topics. Based on the observed patterns, we introduce a novel offensive language detection method: a contrastive learning model that embeds chat content with a random masking strategy. We show that this model outperforms existing models in detecting offensive language in open-domain chat conversations while also demonstrating robustness against users’ deliberate text manipulation tactics when using offensive language. We release our curated chatbot dataset to foster research on offensive language detection in open-domain conversations and share lessons learned from mitigating offensive language on a live platform.
Anthology ID:
2024.lrec-main.426
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
4760–4771
URL:
https://aclanthology.org/2024.lrec-main.426
Cite (ACL):
Hyeonho Song, Jisu Hong, Chani Jung, Hyojin Chin, Mingi Shin, Yubin Choi, Junghoi Choi, and Meeyoung Cha. 2024. Detecting Offensive Language in an Open Chatbot Platform. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4760–4771, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Detecting Offensive Language in an Open Chatbot Platform (Song et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2024.lrec-main.426.pdf