Detoxifying Online Discourse: A Guided Response Generation Approach for Reducing Toxicity in User-Generated Text

Ritwik Bose, Ian Perera, Bonnie Dorr


Abstract
The expression of opinions, stances, and moral foundations on social media often coincides with toxic, divisive, or inflammatory language that can make constructive discourse across communities difficult. Natural language generation methods could provide a means to reframe or reword such expressions in a way that fosters more civil discourse, yet current Large Language Model (LLM) methods tend towards language that is too generic or formal to seem authentic for social media discussions. We present preliminary work on training LLMs to maintain authenticity while presenting a community’s ideas and values in a constructive, non-toxic manner.
Anthology ID: 2023.sicon-1.2
Volume: Proceedings of the First Workshop on Social Influence in Conversations (SICon 2023)
Month: July
Year: 2023
Address: Toronto, Canada
Venue: SICon
Publisher: Association for Computational Linguistics
Pages: 9–14
URL: https://aclanthology.org/2023.sicon-1.2
DOI: 10.18653/v1/2023.sicon-1.2
Cite (ACL): Ritwik Bose, Ian Perera, and Bonnie Dorr. 2023. Detoxifying Online Discourse: A Guided Response Generation Approach for Reducing Toxicity in User-Generated Text. In Proceedings of the First Workshop on Social Influence in Conversations (SICon 2023), pages 9–14, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Detoxifying Online Discourse: A Guided Response Generation Approach for Reducing Toxicity in User-Generated Text (Bose et al., SICon 2023)
PDF: https://preview.aclanthology.org/remove-xml-comments/2023.sicon-1.2.pdf