Danda Rawat

2025

pdf bib abs
Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate Speech.
Mikel Ngueajio | Flor Miriam Plaza-del-Arco | Yi-Ling Chung | Danda Rawat | Amanda Cercas Curry
Proceedings of the The 9th Workshop on Online Abuse and Harms (WOAH)

Automated counter-narratives (CN) offer a promising strategy for mitigating online hate speech, yet concerns about their affective tone, accessibility, and ethical risks remain. We propose a framework for evaluating Large Language Model (LLM)-generated CNs across four dimensions: persona framing, verbosity and readability, affective tone, and ethical robustness. Using GPT-4o-Mini, Cohere’s CommandR-7B, and Meta’s LLaMA 3.1-70B, we assess three prompting strategies on the MT-Conan and HatEval datasets.Our findings reveal that LLM-generated CNs are often verbose and adapted for people with college-level literacy, limiting their accessibility. While emotionally guided prompts yield more empathetic and readable responses, there remain concerns surrounding safety and effectiveness.

2022

pdf bib abs
Offensive Content Detection via Synthetic Code-Switched Text
Cesa Salaam | Franck Dernoncourt | Trung Bui | Danda Rawat | Seunghyun Yoon
Proceedings of the 29th International Conference on Computational Linguistics

The prevalent use of offensive content in social media has become an important reason for concern for online platforms (customer service chat-boxes, social media platforms, etc). Classifying offensive and hate-speech content in online settings is an essential task in many applications that needs to be addressed accordingly. However, online text from online platforms can contain code-switching, a combination of more than one language. The non-availability of labeled code-switched data for low-resourced code-switching combinations adds difficulty to this problem. To overcome this, we release a real-world dataset containing around 10k samples for testing for three language combinations en-fr, en-es, and en-de, and a synthetic code-switched textual dataset containing ~30k samples for training In this paper, we describe the process for gathering the human-generated data and our algorithm for creating synthetic code-switched offensive content data. We also introduce the results of a keyword classification baseline and a multi-lingual transformer-based classification model.

Co-authors

Flor Miriam Plaza-Del-Arco 1

Cesa Salaam 1

Seunghyun Yoon 1

Venues

Fix author