Resources for Annotating Hate Speech in Social Media Platforms Used in Ethiopia: A Novel Lexicon and Labelling Scheme
Nuhu Ibrahim, Felicity Mulford, Matt Lawrence, Riza Batista-Navarro
Abstract
Hate speech on social media has proliferated in Ethiopia. To support studies aimed at investigating the targets and types of hate speech circulating in the Ethiopian context, we developed a new fine-grained annotation scheme that captures three elements of hate speech: the target (i.e., any groups with protected characteristics), type (i.e., the method of abuse) and nature (i.e., the style of the language used). We also developed a new lexicon of hate speech-related keywords in the four most prominent languages found on Ethiopian social media: Amharic, Afaan Oromo, English and Tigrigna. These keywords enabled us to retrieve social media posts (also in the same four languages) that are likely to contain hate speech from three platforms (i.e., X, Telegram and Facebook). Experts in the Ethiopian context then manually annotated a sample of those retrieved posts, obtaining fair to moderate inter-annotator agreement. The resulting annotations formed the basis of a case study of which groups tend to be targeted by particular types of hate speech or by particular styles of hate speech language.
- Anthology ID:
- 2024.rail-1.13
- Volume:
- Proceedings of the Fifth Workshop on Resources for African Indigenous Languages @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Rooweither Mabuya, Muzi Matfunjwa, Mmasibidi Setaka, Menno van Zaanen
- Venues:
- RAIL | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 115–123
- URL:
- https://preview.aclanthology.org/Author-page-Marten-During-lu/2024.rail-1.13/
- Cite (ACL):
- Nuhu Ibrahim, Felicity Mulford, Matt Lawrence, and Riza Batista-Navarro. 2024. Resources for Annotating Hate Speech in Social Media Platforms Used in Ethiopia: A Novel Lexicon and Labelling Scheme. In Proceedings of the Fifth Workshop on Resources for African Indigenous Languages @ LREC-COLING 2024, pages 115–123, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Resources for Annotating Hate Speech in Social Media Platforms Used in Ethiopia: A Novel Lexicon and Labelling Scheme (Ibrahim et al., RAIL 2024)
- PDF:
- https://preview.aclanthology.org/Author-page-Marten-During-lu/2024.rail-1.13.pdf