Cihan Karsak
2022
A Comparison of Machine Learning Techniques for Turkish Profanity Detection
Levent Soykan | Cihan Karsak | Ilknur Durgar Elkahlout | Burak Aytan
Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis
Profanity detection has become an important task with the rise of social media usage. Most users prefer a clean, profanity-free environment in which to communicate with others. To provide such an environment, service providers use various profanity detection tools. In this paper, we study Turkish profanity detection in our search engine. We collected a dataset of search engine queries and labeled each query as one of two classes: profane or not-profane. We experimented with several classical machine learning and deep learning methods and compared them in terms of speed and accuracy. We obtained our best score, an F1 of 0.93, with the transformer-based Electra model. We also compared our models with the state-of-the-art Turkish profanity detection tool and observed that ours outperform it in all aspects.
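As a rough illustration of the kind of comparison the abstract describes, the sketch below scores a binary profane/not-profane classifier by F1 on the profane class and records its prediction time. It is not the authors' code: the blocklist classifier, the placeholder token "badword", the example queries, and the function names are all hypothetical stand-ins, and the F1 it yields has no relation to the scores reported in the paper.

```python
import time


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (profane) class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def evaluate(predict, queries, labels):
    """Return (F1, elapsed seconds) for a predict(query) -> 0/1 function."""
    start = time.perf_counter()
    preds = [predict(q) for q in queries]
    elapsed = time.perf_counter() - start
    return f1_score(labels, preds), elapsed


# Hypothetical baseline: flag a query if any token is on a blocklist.
BLOCKLIST = {"badword"}


def blocklist_predict(query):
    return int(any(tok in BLOCKLIST for tok in query.lower().split()))


# Made-up queries; label 1 = profane, 0 = not-profane.
queries = [
    "weather today",
    "badword video",
    "cheap flights",
    "BADWORD lyrics",
    "mildly rude phrase",
]
labels = [0, 1, 0, 1, 1]

score, seconds = evaluate(blocklist_predict, queries, labels)
print(f"F1={score:.2f} time={seconds:.6f}s")
```

Any other model can be plugged in by passing a different `predict` function to `evaluate`, which keeps the speed and accuracy measurements comparable across methods.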