HateCheckHIn: Evaluating Hindi Hate Speech Detection Models

Mithun Das, Punyajoy Saha, Binny Mathew, Animesh Mukherjee


Abstract
Due to the sheer volume of online hate, the AI and NLP communities have started building models to detect such hateful content. Multilingual hate has recently emerged as a major challenge for automated detection, since social media conversations are often code-mixed or conducted in more than one language. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1-score. While these metrics are useful, they make it difficult to identify where a model is failing and how to resolve it. To enable more targeted diagnostic insights into such multilingual hate speech models, we introduce a set of functionalities for the purpose of evaluation. The design of these functionalities is inspired by real-world conversations on social media. Considering Hindi as the base language, we craft test cases for each functionality. We name our evaluation dataset HateCheckHIn. To illustrate the utility of these functionalities, we test the state-of-the-art transformer-based m-BERT model and the Perspective API.
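The contrast between a single aggregate metric and per-functionality diagnostics can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the functionality names (`direct`, `negated`), the English placeholder test cases, and the `toy_predict` keyword classifier are stand-ins for illustration, not the HateCheckHIn functionalities, its data, or the models tested in the paper.

```python
from collections import defaultdict

# Hypothetical test cases: (functionality, text, gold label).
# These are placeholders, not items from HateCheckHIn.
TOY_CASES = [
    ("direct", "hateful statement A", "hateful"),
    ("direct", "hateful statement B", "hateful"),
    ("negated", "not a hateful statement", "non-hateful"),
    ("negated", "neutral statement", "non-hateful"),
]


def toy_predict(text):
    """Stand-in classifier (assumption): flags any text containing a keyword."""
    return "hateful" if "hateful" in text else "non-hateful"


def overall_accuracy(cases, predict):
    """Aggregate accuracy over all test cases, ignoring functionality."""
    hits = sum(1 for _, text, gold in cases if predict(text) == gold)
    return hits / len(cases)


def per_functionality_accuracy(cases, predict):
    """Accuracy computed separately for each functionality."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for functionality, text, gold in cases:
        total[functionality] += 1
        if predict(text) == gold:
            correct[functionality] += 1
    return {name: correct[name] / total[name] for name in total}


if __name__ == "__main__":
    print("overall:", overall_accuracy(TOY_CASES, toy_predict))
    for name, acc in per_functionality_accuracy(TOY_CASES, toy_predict).items():
        print(name, acc)
```

In this toy setup the classifier reaches 0.75 overall accuracy, yet the per-functionality view shows it is right on only half of the `negated` cases, which is the kind of failure an aggregate score hides.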
Anthology ID:
2022.lrec-1.575
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5378–5387
URL:
https://aclanthology.org/2022.lrec-1.575
Cite (ACL):
Mithun Das, Punyajoy Saha, Binny Mathew, and Animesh Mukherjee. 2022. HateCheckHIn: Evaluating Hindi Hate Speech Detection Models. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5378–5387, Marseille, France. European Language Resources Association.
Cite (Informal):
HateCheckHIn: Evaluating Hindi Hate Speech Detection Models (Das et al., LREC 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2022.lrec-1.575.pdf
Code
 hate-alert/hatecheckhin
Data
Hate Speech