JL-Hate: An Annotated Dataset for Joint Learning of Hate Speech and Target Detection
Kaan Büyükdemirci, Izzet Emre Kucukkaya, Eren Ölmez, Cagri Toraman
Abstract
Hate speech detection has been studied extensively, and machine learning algorithms play a crucial role in it. Existing resources mostly focus on text sequence classification for hate speech detection. However, the target of hateful content is another dimension that has not been studied in detail due to the lack of data resources. In this study, we address this gap by introducing JL-Hate, a novel tweet dataset for joint learning of hate speech detection and target detection, framed as sequence classification and token classification tasks, respectively. The JL-Hate dataset consists of 1,530 tweets divided equally between English and Turkish. Leveraging this dataset, we conduct a series of benchmark experiments. We utilize a joint learning model to perform the sequence and token classification tasks concurrently on our data. Our experimental results are consistent with those of existing studies in both sequence and token classification.
- Anthology ID:
- 2024.lrec-main.834
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 9543–9553
- URL:
- https://aclanthology.org/2024.lrec-main.834
- Cite (ACL):
- Kaan Büyükdemirci, Izzet Emre Kucukkaya, Eren Ölmez, and Cagri Toraman. 2024. JL-Hate: An Annotated Dataset for Joint Learning of Hate Speech and Target Detection. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 9543–9553, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- JL-Hate: An Annotated Dataset for Joint Learning of Hate Speech and Target Detection (Büyükdemirci et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/landing_page/2024.lrec-main.834.pdf