Abstract
This paper addresses hate speech detection in Turkish and Arabic tweets, contributing to the HSD-2Lang Shared Task. We propose a specialized pooling strategy within a soft-voting ensemble framework to improve classification in Turkish and Arabic language models. Our approach also includes expanding the training sets through cross-lingual translation, introducing a broader spectrum of hate speech examples. Our method attains F1-Macro scores of 0.6964 for Turkish (Subtask A) and 0.7123 for Arabic (Subtask B). While achieving these results, we also consider computational overhead, balancing it against the effectiveness of our pooling strategy, data augmentation, and soft-voting ensemble. This approach advances the practical application of language models in low-resource languages for hate speech detection.
- Anthology ID:
- 2024.case-1.28
- Volume:
- Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julians, Malta
- Editors:
- Ali Hürriyetoğlu, Hristo Tanev, Surendrabikram Thapa, Gökçe Uludoğan
- Venues:
- CASE | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 199–204
- URL:
- https://aclanthology.org/2024.case-1.28
- Cite (ACL):
- Fatima Zahra Qachfar, Bryan Tuck, and Rakesh Verma. 2024. DetectiveReDASers at HSD-2Lang 2024: A New Pooling Strategy with Cross-lingual Augmentation and Ensembling for Hate Speech Detection in Low-resource Languages. In Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024), pages 199–204, St. Julians, Malta. Association for Computational Linguistics.
- Cite (Informal):
- DetectiveReDASers at HSD-2Lang 2024: A New Pooling Strategy with Cross-lingual Augmentation and Ensembling for Hate Speech Detection in Low-resource Languages (Qachfar et al., CASE-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.case-1.28.pdf
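
The abstract names two generic techniques, soft-voting ensembling and the F1-Macro metric, without spelling them out. The following is a minimal sketch of both, not the authors' implementation: it assumes equally weighted models and binary labels, and the function names (`soft_vote`, `macro_f1`) and all probabilities and labels are made up for illustration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average per-class probabilities across models, then take the argmax.

    prob_sets[m][i][c] is model m's probability that example i has class c.
    Equal model weights are an assumption of this sketch.
    """
    n_models = len(prob_sets)
    n_examples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    predictions = []
    for i in range(n_examples):
        avg = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int = 2) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Hypothetical outputs of three classifiers on four tweets: [P(not hate), P(hate)].
model_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.7, 0.3]]
model_b = [[0.6, 0.4], [0.7, 0.3], [0.3, 0.7], [0.4, 0.6]]
model_c = [[0.8, 0.2], [0.3, 0.7], [0.1, 0.9], [0.6, 0.4]]

preds = soft_vote([model_a, model_b, model_c])
gold = [0, 1, 1, 1]  # made-up gold labels
score = macro_f1(gold, preds)
print(preds, round(score, 4))
```

Because soft voting averages probabilities rather than counting hard votes, a confident model can outweigh two uncertain ones, which is one reason it is often preferred when the ensemble members are reasonably calibrated.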