HateImgPrompts: Mitigating Generation of Images Spreading Hate Speech
Vineet Kumar Khullar, Venkatesh Velugubantla, Bhanu Prakash Reddy Rella, Mohan Krishna Mannava, Msvpj Sathvik
Abstract
The emergence of artificial intelligence has proven beneficial to numerous organizations, particularly in its applications for social welfare. One notable application is AI-driven image generation, in which tools produce images from user-provided prompts. While this technology holds potential for constructive use, it also carries the risk of being exploited for malicious purposes, such as propagating hate. To address this risk, we propose a novel dataset, “HateImgPrompts”. We benchmark the dataset with recent models, including GPT-3.5 and Llama 2. The dataset consists of 9467 prompts, and a classifier fine-tuned on the dataset reaches an accuracy of around 81%.
- Anthology ID:
- 2025.nlp4dh-1.53
- Volume:
- Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities
- Month:
- May
- Year:
- 2025
- Address:
- Albuquerque, USA
- Editors:
- Mika Hämäläinen, Emily Öhman, Yuri Bizzoni, So Miyagawa, Khalid Alnajjar
- Venues:
- NLP4DH | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 647–652
- URL:
- https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nlp4dh-1.53/
- Cite (ACL):
- Vineet Kumar Khullar, Venkatesh Velugubantla, Bhanu Prakash Reddy Rella, Mohan Krishna Mannava, and Msvpj Sathvik. 2025. HateImgPrompts: Mitigating Generation of Images Spreading Hate Speech. In Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, pages 647–652, Albuquerque, USA. Association for Computational Linguistics.
- Cite (Informal):
- HateImgPrompts: Mitigating Generation of Images Spreading Hate Speech (Khullar et al., NLP4DH 2025)
- PDF:
- https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nlp4dh-1.53.pdf