APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets

Kichang Yang; Wonjun Jang; Won Ik Cho

doi:10.18653/v1/2022.findings-emnlp.525

Abstract

In hate speech detection, developing training and evaluation datasets across various domains is a critical issue. However, most approaches crawl social media texts and hire crowd-workers to annotate the data. Following this convention often restricts the scope of pejorative expressions to a single domain, which limits generalization. In addition, domain overlap between the training corpus and the evaluation set can overestimate prediction performance when language models are pretrained on a low-data language. To alleviate these problems in Korean, we propose APEACH, which asks unspecified users to generate hate speech examples, followed by minimal post-labeling. We find that APEACH can collect useful datasets that are less sensitive to lexical overlap between the pretraining corpus and the evaluation set, thereby measuring model performance properly.

Anthology ID: 2022.findings-emnlp.525
Volume: Findings of the Association for Computational Linguistics: EMNLP 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 7076–7086
URL: https://preview.aclanthology.org/fix-sig-urls/2022.findings-emnlp.525/
DOI: 10.18653/v1/2022.findings-emnlp.525
Cite (ACL): Kichang Yang, Wonjun Jang, and Won Ik Cho. 2022. APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 7076–7086, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets (Yang et al., Findings 2022)
PDF: https://preview.aclanthology.org/fix-sig-urls/2022.findings-emnlp.525.pdf
Dataset: 2022.findings-emnlp.525.dataset.zip
Video: https://preview.aclanthology.org/fix-sig-urls/2022.findings-emnlp.525.mp4
