JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets

Preeti Mukherjee, Mainak Pal, Somnath Banerjee, Sudip Kumar Naskar


Abstract
This paper describes our system submissions as part of our participation (team name: JU_ETCE_17_21) in the SemEval 2019 shared task 6: “OffensEval: Identifying and Categorizing Offensive Language in Social Media”. We participated in all the three sub-tasks: i) Sub-task A: offensive language identification, ii) Sub-task B: automatic categorization of offense types, and iii) Sub-task C: offense target identification. We employed machine learning as well as deep learning approaches for the sub-tasks. We employed Convolutional Neural Network (CNN) and Recursive Neural Network (RNN) Long Short-Term Memory (LSTM) with pre-trained word embeddings. We used both word2vec and Glove pre-trained word embeddings. We obtained the best F1-score using CNN based model for sub-task A, LSTM based model for sub-task B and Logistic Regression based model for sub-task C. Our best submissions achieved 0.7844, 0.5459 and 0.48 F1-scores for sub-task A, sub-task B and sub-task C respectively.
Anthology ID:
S19-2118
Volume:
Proceedings of the 13th International Workshop on Semantic Evaluation
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota, USA
Editors:
Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
662–667
URL:
https://aclanthology.org/S19-2118
DOI:
10.18653/v1/S19-2118
Cite (ACL):
Preeti Mukherjee, Mainak Pal, Somnath Banerjee, and Sudip Kumar Naskar. 2019. JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 662–667, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal):
JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets (Mukherjee et al., SemEval 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/S19-2118.pdf