Viola De Ruvo


2025

Detoxify-IT: An Italian Parallel Dataset for Text Detoxification
Viola De Ruvo | Arianna Muti | Daryna Dementieva | Debora Nozza
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)

Toxic language online poses growing challenges for content moderation. Detoxification, which rewrites toxic content into neutral form, offers a promising alternative but remains underexplored beyond English. We present Detoxify-IT, the first Italian dataset for this task, featuring toxic comments and their human-written neutral rewrites. Our experiments show that even limited fine-tuning on Italian data leads to notable improvements in content preservation and fluency compared to both multilingual models and LLMs used in zero-shot settings, underlining the need for language-specific resources. This work enables detoxification research in Italian and supports broader efforts toward safer, more inclusive online communication.