To Paraphrase or Not: Efficient Comment Detoxification with Unsupervised Detoxifiability Discrimination

Jing Ke, Zheyong Xie, Shaosheng Cao, Tong Xu, Enhong Chen


Abstract
Mitigating toxic content is critical for maintaining a healthy social platform, yet existing detoxification systems face two significant limitations: overcorrection caused by uniformly processing all toxic comments, and the scarcity of parallel data for training paraphrasing models. To tackle these challenges, we propose Detoxifiability-Aware Detoxification (DID), a novel paradigm that adaptively filters or paraphrases each toxic comment based on its detoxifiability, i.e., whether it can in essence be paraphrased into a benign comment. Specifically, DID integrates three core modules: (1) an unsupervised detoxifiability discriminator, (2) a semantic purification module that extracts harmful intents and then performs targeted paraphrasing only on detoxifiable comments, and (3) a feedback-adaptive refinement loop that reprocesses remaining harmful content only when it is detoxifiable. Experimental results demonstrate that DID significantly outperforms existing approaches on academic data and an industrial platform, establishing a novel and practical modeling paradigm for comment detoxification.
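
The adaptive routing described in the abstract can be illustrated with a minimal sketch. All helpers below (the toy lexicon, is_toxic, is_detoxifiable, purify) are hypothetical stand-ins, not the paper's actual modules; they only make the filter-or-paraphrase control flow concrete and runnable.

```python
# Minimal sketch of detoxifiability-aware routing, assuming toy keyword heuristics
# in place of the paper's learned modules.

PROFANITY = {"idiot", "stupid"}  # toy lexicon (assumption, not from the paper)

def is_toxic(comment: str) -> bool:
    """Toy toxicity check: any lexicon word makes the comment toxic."""
    return any(w in comment.lower() for w in PROFANITY)

def is_detoxifiable(comment: str) -> bool:
    """Toy discriminator: detoxifiable if some content survives dropping toxic words."""
    kept = [w for w in comment.split() if w.lower().strip(".,!?") not in PROFANITY]
    return len(kept) > 0

def purify(comment: str) -> str:
    """Toy semantic purification: drop toxic words, keep the remaining intent."""
    kept = [w for w in comment.split() if w.lower().strip(".,!?") not in PROFANITY]
    return " ".join(kept)

def detoxify(comment: str, max_rounds: int = 3) -> str | None:
    """Adaptive routing: filter non-detoxifiable comments, paraphrase detoxifiable ones."""
    if not is_toxic(comment):
        return comment                      # already benign, leave untouched
    if not is_detoxifiable(comment):
        return None                         # (1) discriminator: filter the comment out
    for _ in range(max_rounds):             # (3) feedback-adaptive refinement loop
        comment = purify(comment)           # (2) targeted paraphrasing of harmful intents
        if not is_toxic(comment):
            return comment
        if not is_detoxifiable(comment):
            return None
    return None

print(detoxify("This idea is stupid but the data split seems wrong."))
```

In this sketch, non-detoxifiable comments are removed outright, while detoxifiable ones are paraphrased and re-checked, mirroring the paradigm's goal of avoiding overcorrection from uniform processing.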
Anthology ID:
2026.eacl-short.14
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Màrquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
207–213
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-short.14/
Cite (ACL):
Jing Ke, Zheyong Xie, Shaosheng Cao, Tong Xu, and Enhong Chen. 2026. To Paraphrase or Not: Efficient Comment Detoxification with Unsupervised Detoxifiability Discrimination. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers), pages 207–213, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
To Paraphrase or Not: Efficient Comment Detoxification with Unsupervised Detoxifiability Discrimination (Ke et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-short.14.pdf
Checklist:
 2026.eacl-short.14.checklist.pdf