Pengcheng Luo
2025
Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RL
Xiaoying Song | Anirban Saha Anik | Dibakar Barua | Pengcheng Luo | Junhua Ding | Lingzi Hong
Findings of the Association for Computational Linguistics: EMNLP 2025
Health misinformation spreading online poses a significant threat to public health. Researchers have explored methods for automatically generating counterspeech to health misinformation as a mitigation strategy. Existing approaches, however, often produce uniform responses, ignoring that the audience's health literacy level can affect the accessibility and effectiveness of counterspeech. We propose Controlled-Literacy, a framework that combines retrieval-augmented generation (RAG) with reinforcement learning (RL) to generate counterspeech tailored to different health literacy levels. In particular, we retrieve knowledge aligned with a specific health literacy level, providing accessible and factual information to support generation. We design a reward function that combines subjective user preferences with objective readability-based rewards to optimize counterspeech toward the target health literacy level. Experimental results show that Controlled-Literacy outperforms baselines by generating more accessible and user-preferred counterspeech. This research contributes to more equitable and impactful public health communication by improving the accessibility and comprehension of counterspeech to health misinformation.
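As a rough illustration of the kind of reward described in this abstract, the sketch below blends a subjective preference score with an objective readability term. The exact formulation is not given in the abstract; the mixing weight alpha, the per-level target grade values, and the use of the Flesch-Kincaid grade as the readability metric are all illustrative assumptions, not the paper's implementation.

```python
import textstat  # readability metrics; pip install textstat

# Hypothetical target reading grade levels per health literacy tier
# (illustrative values, not taken from the paper).
TARGET_GRADE = {"low": 6.0, "medium": 9.0, "high": 13.0}

def readability_reward(text: str, literacy_level: str) -> float:
    """Objective reward: higher when the Flesch-Kincaid grade of the
    generated counterspeech is close to the target literacy level."""
    grade = textstat.flesch_kincaid_grade(text)
    # Map the absolute grade gap into (0, 1]; a gap of 0 gives reward 1.
    return 1.0 / (1.0 + abs(grade - TARGET_GRADE[literacy_level]))

def combined_reward(text: str, literacy_level: str,
                    preference_score: float, alpha: float = 0.5) -> float:
    """Blend a subjective user-preference score (assumed to come from a
    learned preference model and to lie in [0, 1]) with the objective
    readability reward. alpha is an assumed mixing weight."""
    return (alpha * preference_score
            + (1.0 - alpha) * readability_reward(text, literacy_level))
```

A scalar reward of this shape is what an RL objective (e.g., PPO-style policy optimization) would maximize during generation; the paper's actual reward model and optimizer may differ.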
2024
Outcome-Constrained Large Language Models for Countering Hate Speech
Lingzi Hong | Pengcheng Luo | Eduardo Blanco | Xiaoying Song
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Automatic counterspeech generation methods have been developed to assist efforts to combat hate speech. Existing research focuses on generating counterspeech with linguistic attributes such as being polite, informative, and intent-driven. However, the real impact of counterspeech in online environments is seldom considered. This study develops methods for generating counterspeech constrained by conversation outcomes and evaluates their effectiveness. We experiment with large language models (LLMs) to incorporate two desired conversation outcomes into the text generation process: low conversation incivility and non-hateful hater reentry. Specifically, we experiment with instruction prompts, LLM fine-tuning, and LLM reinforcement learning (RL). Evaluation results show that our methods effectively steer the generation of counterspeech toward the desired outcomes. Our analyses, however, show that the quality and style of the generated counterspeech differ across models.
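One way to picture the outcome constraint described here is as a reward signal built from outcome predictors. The sketch below is a minimal illustration, assuming two classifiers that score the predicted conversation outcomes; the classifier interfaces, probability semantics, and weighting are assumptions for illustration, not the paper's method.

```python
from typing import Callable

def outcome_reward(counterspeech: str,
                   incivility_clf: Callable[[str], float],
                   reentry_clf: Callable[[str], float],
                   w_incivility: float = 0.5) -> float:
    """Scalar reward steering RL toward the two desired outcomes:
    low conversation incivility and non-hateful hater reentry.
    Both classifiers are assumed to return probabilities in [0, 1];
    their form and the 50/50 weighting are illustrative assumptions."""
    p_incivility = incivility_clf(counterspeech)  # P(follow-up turns are uncivil)
    p_reentry = reentry_clf(counterspeech)        # P(hater reenters non-hatefully)
    # Reward low predicted incivility and high predicted benign reentry.
    return (w_incivility * (1.0 - p_incivility)
            + (1.0 - w_incivility) * p_reentry)
```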
Co-authors
- Lingzi Hong 2
- Xiaoying Song 2
- Anirban Saha Anik 1
- Dibakar Barua 1
- Eduardo Blanco 1