JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets
Preeti Mukherjee, Mainak Pal, Somnath Banerjee, Sudip Kumar Naskar
Abstract
This paper describes our system submissions as part of our participation (team name: JU_ETCE_17_21) in the SemEval 2019 shared task 6: “OffensEval: Identifying and Categorizing Offensive Language in Social Media”. We participated in all three sub-tasks: i) Sub-task A: offensive language identification, ii) Sub-task B: automatic categorization of offense types, and iii) Sub-task C: offense target identification. We employed machine learning as well as deep learning approaches for the sub-tasks. We employed Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) Long Short-Term Memory (LSTM) models with pre-trained word embeddings. We used both word2vec and GloVe pre-trained word embeddings. We obtained the best F1-score using a CNN-based model for sub-task A, an LSTM-based model for sub-task B and a Logistic Regression-based model for sub-task C. Our best submissions achieved F1-scores of 0.7844, 0.5459 and 0.48 for sub-task A, sub-task B and sub-task C respectively.
- Anthology ID:
- S19-2118
- Volume:
- Proceedings of the 13th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota, USA
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 662–667
- URL:
- https://aclanthology.org/S19-2118
- DOI:
- 10.18653/v1/S19-2118
- Cite (ACL):
- Preeti Mukherjee, Mainak Pal, Somnath Banerjee, and Sudip Kumar Naskar. 2019. JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 662–667, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
- Cite (Informal):
- JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets (Mukherjee et al., SemEval 2019)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/S19-2118.pdf