Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories

Mareike Lisker, Christina Gottschalk, Helena Mihaljević


Abstract
Counterspeech is a key strategy against harmful online content, but scaling expert-driven efforts is challenging. Large Language Models (LLMs) present a potential solution, though their use in countering conspiracy theories is under-researched. Unlike for hate speech, no datasets exist that pair conspiracy theory comments with expert-crafted counterspeech. We address this gap by evaluating the ability of GPT-4o, Llama 3, and Mistral to effectively apply counterspeech strategies derived from psychological research provided through structured prompts. Our results show that the models often generate generic, repetitive, or superficial results. Additionally, they over-acknowledge fear and frequently hallucinate facts, sources, or figures, making their prompt-based use in practical applications problematic.
Anthology ID:
2025.woah-1.15
Volume:
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)
Month:
August
Year:
2025
Address:
Vienna, Austria
Editors:
Agostina Calabrese, Christine de Kock, Debora Nozza, Flor Miriam Plaza-del-Arco, Zeerak Talat, Francielle Vargas
Venues:
WOAH | WS
Publisher:
Association for Computational Linguistics
Pages:
163–178
URL:
https://preview.aclanthology.org/landing_page/2025.woah-1.15/
Cite (ACL):
Mareike Lisker, Christina Gottschalk, and Helena Mihaljević. 2025. Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories. In Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH), pages 163–178, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories (Lisker et al., WOAH 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.woah-1.15.pdf
Supplementary material:
2025.woah-1.15.SupplementaryMaterial.zip