HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection

Francielle Vargas, Samuel Guimarães, Shamsuddeen Hassan Muhammad, Diego Alves, Ibrahim Said Ahmad, Idris Abdulmumin, Diallo Mohamed, Thiago Pardo, Fabrício Benevenuto


Abstract
We introduce the first expert-annotated corpus of Facebook comments for Hausa hate speech detection. The corpus, titled HausaHate, comprises 2,000 comments extracted from West African Facebook pages and manually annotated by three native Hausa speakers who are also NLP experts. The corpus was annotated in two layers. First, each comment was labeled as offensive or non-offensive. Then, offensive comments were further labeled according to hate speech target: race, gender, or none. Finally, we present a baseline model based on a fine-tuned LLM for Hausa hate speech detection, highlighting the challenges of hate speech detection for indigenous African languages and directions for future work.
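As an illustration of the two-layer annotation scheme described in the abstract, a record could be modeled roughly as follows. This is a minimal sketch, not the corpus's actual release format: the class name `AnnotatedComment`, its field names, and the validation rule are all hypothetical, and only the label sets (offensive versus non-offensive; race, gender, none) come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional

# Second-layer target labels named in the abstract.
TARGETS = {"race", "gender", "none"}


@dataclass(frozen=True)
class AnnotatedComment:
    """Hypothetical record for one comment (field names are assumptions)."""

    text: str
    offensive: bool                # layer 1: offensive vs. non-offensive
    target: Optional[str] = None   # layer 2: only set for offensive comments

    def __post_init__(self) -> None:
        if self.offensive:
            # Offensive comments must carry one of the target labels.
            if self.target not in TARGETS:
                raise ValueError(f"target must be one of {sorted(TARGETS)}")
        elif self.target is not None:
            # Non-offensive comments receive no second-layer label.
            raise ValueError("non-offensive comments take no target label")


# Usage with placeholder text (not real corpus data).
offensive_example = AnnotatedComment("placeholder text", True, "gender")
neutral_example = AnnotatedComment("placeholder text", False)
```

Tying the second layer to the first in the constructor mirrors the abstract's statement that only offensive comments receive a target label.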
Anthology ID:
2024.woah-1.5
Volume:
Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Yi-Ling Chung, Zeerak Talat, Debora Nozza, Flor Miriam Plaza-del-Arco, Paul Röttger, Aida Mostafazadeh Davani, Agostina Calabrese
Venues:
WOAH | WS
Publisher:
Association for Computational Linguistics
Pages:
52–58
URL:
https://aclanthology.org/2024.woah-1.5
Cite (ACL):
Francielle Vargas, Samuel Guimarães, Shamsuddeen Hassan Muhammad, Diego Alves, Ibrahim Said Ahmad, Idris Abdulmumin, Diallo Mohamed, Thiago Pardo, and Fabrício Benevenuto. 2024. HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection. In Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024), pages 52–58, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection (Vargas et al., WOAH-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.woah-1.5.pdf