Mohan Krishna Mannava
2025
HateImgPrompts: Mitigating Generation of Images Spreading Hate Speech
Vineet Kumar Khullar, Venkatesh Velugubantla, Bhanu Prakash Reddy Rella, Mohan Krishna Mannava, Msvpj Sathvik
Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities
The emergence of artificial intelligence has proven beneficial to numerous organizations, particularly through its applications for social welfare. One notable application is AI-driven image generation, in which tools produce images from user-provided prompts. While this technology has potential for constructive use, it can also be exploited for malicious purposes, such as propagating hate. To address this, we propose a novel dataset, “HateImgPrompts”. We benchmark the dataset with recent models such as GPT-3.5 and LLAMA 2. The dataset consists of 9467 prompts, and a classifier fine-tuned on it reaches an accuracy of around 81%.