The State of Profanity Obfuscation in Natural Language Processing Scientific Publications

Debora Nozza, Dirk Hovy


Abstract
Work on hate speech has made considering rude and harmful examples in scientific publications inevitable. This situation raises various problems, such as whether or not to obscure profanities. While science must accurately disclose what it does, the unwarranted spread of hate speech can harm readers and increase its frequency on the internet. While maintaining publications’ professional appearance, obfuscating profanities makes it challenging to evaluate the content, especially for non-native speakers. Surveying 150 ACL papers, we discovered that obfuscation is usually used for English but not other languages, and even then, quite unevenly. We discuss the problems with obfuscation and suggest a multilingual community resource called PrOf with a Python module to standardize profanity obfuscation processes. We believe PrOf can help scientific publication policies to make hate speech work accessible and comparable, irrespective of language.
Anthology ID:
2023.findings-acl.240
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3897–3909
URL:
https://aclanthology.org/2023.findings-acl.240
DOI:
10.18653/v1/2023.findings-acl.240
Cite (ACL):
Debora Nozza and Dirk Hovy. 2023. The State of Profanity Obfuscation in Natural Language Processing Scientific Publications. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3897–3909, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
The State of Profanity Obfuscation in Natural Language Processing Scientific Publications (Nozza & Hovy, Findings 2023)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2023.findings-acl.240.pdf
Video:
 https://preview.aclanthology.org/naacl-24-ws-corrections/2023.findings-acl.240.mp4