Abstract
In a hate speech detection model, we should consider two critical aspects in addition to detection performance: bias and explainability. Hate speech cannot be identified based solely on the presence of specific words; the model should be able to reason like humans and be explainable. To improve the performance concerning the two aspects, we propose Masked Rationale Prediction (MRP) as an intermediate task. MRP is a task to predict the masked human rationales (snippets of a sentence that are grounds for human judgment) by referring to surrounding tokens combined with their unmasked rationales. As the model learns its reasoning ability based on rationales by MRP, it performs hate speech detection robustly in terms of bias and explainability. The proposed method generally achieves state-of-the-art performance in various metrics, demonstrating its effectiveness for hate speech detection. Warning: This paper contains samples that may be upsetting.
- Anthology ID:
- 2022.coling-1.577
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 6644–6655
- URL:
- https://aclanthology.org/2022.coling-1.577
- Cite (ACL):
- Jiyun Kim, Byounghan Lee, and Kyung-Ah Sohn. 2022. Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection. In Proceedings of the 29th International Conference on Computational Linguistics, pages 6644–6655, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection (Kim et al., COLING 2022)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2022.coling-1.577.pdf
- Code:
- alatteaday/mrp_hate-speech-detection
- Data:
- HateXplain
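
The abstract describes MRP as predicting masked rationale labels from the surrounding tokens and their unmasked rationales. A minimal sketch of how such training pairs might be built is given below. It is not the authors' implementation: the `mask_rationales` helper, the `MASK` placeholder value, the masking ratio, and the example tokens are all assumptions made for illustration.

```python
import random

MASK = -1  # hypothetical placeholder for a hidden rationale label (assumption)


def mask_rationales(rationales, mask_ratio=0.5, seed=None):
    """Hide a random subset of per-token rationale labels.

    rationales: list of 0/1 labels, where 1 means annotators marked the
    token as grounds for their judgment.
    Returns (inputs, targets): inputs carries MASK at hidden positions and
    the original label elsewhere; targets carries the true label at hidden
    positions and None elsewhere.
    """
    rng = random.Random(seed)
    n = len(rationales)
    # Hide at least one position for non-empty input (illustrative choice).
    k = max(1, round(n * mask_ratio)) if n else 0
    hidden = set(rng.sample(range(n), k))
    inputs = [MASK if i in hidden else r for i, r in enumerate(rationales)]
    targets = [r if i in hidden else None for i, r in enumerate(rationales)]
    return inputs, targets


# Neutral placeholder tokens with made-up rationale labels.
tokens = ["tok_a", "tok_b", "tok_c", "tok_d", "tok_e"]
rationales = [0, 0, 1, 1, 1]

inputs, targets = mask_rationales(rationales, mask_ratio=0.4, seed=0)
for tok, visible, target in zip(tokens, inputs, targets):
    print(tok, visible, target)
```

A model trained on such pairs would see every token together with the visible labels in `inputs` and be scored only on the positions where `targets` is not `None`.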