NIT_Agartala_NLP_Team at SemEval-2019 Task 6: An Ensemble Approach to Identifying and Categorizing Offensive Language in Twitter Social Media Corpora

Steve Durairaj Swamy; Anupam Jamatia; Björn Gambäck; Amitava Das

doi:10.18653/v1/S19-2124

Abstract

The paper describes the systems submitted to OffensEval (SemEval 2019, Task 6) on ‘Identifying and Categorizing Offensive Language in Social Media’ by the ‘NIT_Agartala_NLP_Team’. The task organizers provided an annotated Twitter dataset of 13,240 English tweets for training the individual models; the best results were obtained with an ensemble model composed of six different classifiers. The ensemble model produced macro-averaged F1-scores of 0.7434, 0.7078 and 0.4853 on Subtasks A, B, and C, respectively. The paper highlights the overall low predictive power of various linguistic features and surface-level count features, as well as the limitations of a traditional machine learning approach when compared to a deep learning counterpart.
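
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro-averaged F1-score is computed: F1 is calculated per class and the per-class scores are averaged without weighting by class frequency. The function name `macro_f1` and the toy label lists are illustrative assumptions, not taken from the paper or from the task's official scorer.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all classes seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # predicted p, but it was wrong
            fn[g] += 1          # missed an instance of class g
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical, for illustration only): a binary offensive / not-offensive labelling
gold = ["OFF", "NOT", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]
score = macro_f1(gold, pred)
print(score)
```

Because every class contributes equally to the average, a system that performs poorly on a rare class is penalized more heavily than it would be under accuracy or a frequency-weighted F1.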

Anthology ID: S19-2124
Volume: Proceedings of the 13th International Workshop on Semantic Evaluation
Month: June
Year: 2019
Address: Minneapolis, Minnesota, USA
Editors: Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 696–703
URL: https://aclanthology.org/S19-2124
DOI: 10.18653/v1/S19-2124
Cite (ACL): Steve Durairaj Swamy, Anupam Jamatia, Björn Gambäck, and Amitava Das. 2019. NIT_Agartala_NLP_Team at SemEval-2019 Task 6: An Ensemble Approach to Identifying and Categorizing Offensive Language in Twitter Social Media Corpora. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 696–703, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal): NIT_Agartala_NLP_Team at SemEval-2019 Task 6: An Ensemble Approach to Identifying and Categorizing Offensive Language in Twitter Social Media Corpora (Swamy et al., SemEval 2019)
PDF: https://preview.aclanthology.org/nschneid-patch-5/S19-2124.pdf
