Amlan Sengupta


2020

pdf
Ssn_nlp at SemEval 2020 Task 12: Offense Target Identification in Social Media Using Traditional and Deep Machine Learning Approaches
Thenmozhi D. | Nandhinee P.r. | Arunima S. | Amlan Sengupta
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Offensive language identification (OLI) in user generated text is automatic detection of any profanity, insult, obscenity, racism or vulgarity that is addressed towards an individual or a group. Due to immense growth and usage of social media, it has an extensive reach and impact on the society. OLI is helpful for hate speech detection, flame detection and cyber bullying, hence it is used to avoid abuse and hurts. In this paper, we present state of the art machine learning approaches for OLI. We follow several approaches which include classifiers like Naive Bayes, Support Vector Machine(SVM) and deep learning approaches like Recurrent Neural Network(RNN) and Masked LM (MLM). The approaches are evaluated on the OffensEval@SemEval2020 dataset and our team ssn_nlp submitted runs for the third task of OffensEval shared task. The best run of ssn_nlp that uses BERT (Bidirectional Encoder Representations from Transformers) for the purpose of training the OLI model obtained F1 score as 0.61. The model performs with an accuracy of 0.80 and an evaluation loss of 1.0828. The model has a precision rate of 0.72 and a recall rate of 0.58.