Filtering Aggression from the Multilingual Social Media Feed

Sandip Modha, Prasenjit Majumder, Thomas Mandl


Abstract
This paper describes the participation of team DA-LD-Hildesheim, from the Information Retrieval Lab (IRLAB) at DA-IICT Gandhinagar, India, in collaboration with the University of Hildesheim, Germany, and LDRP-ITR, Gandhinagar, India, in the shared task on Aggression Identification, held at a workshop at COLING 2018. The objective of the shared task is to identify the level of aggression in user-generated content on social media written in English, Devanagari Hindi, and Romanized Hindi. Aggression levels are categorized into three predefined classes: ‘Overtly Aggressive’, ‘Covertly Aggressive’, and ‘Non-aggressive’. Participating teams are required to develop a multi-class classifier that assigns user-generated content to these predefined classes. Instead of relying on a bag-of-words model, we used pre-trained vectors for word embedding. We performed experiments with standard machine learning classifiers. In addition, we developed various deep learning models for the multi-class classification problem. On the validation data, our deep learning models achieved higher accuracy than all the standard machine learning classifiers and voting-based ensemble techniques, and the results on the test data support these findings. We also found that the hyper-parameters of the deep neural network are key to improving the results.
Anthology ID:
W18-4423
Volume:
Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
TRAC
Publisher:
Association for Computational Linguistics
Pages:
199–207
URL:
https://aclanthology.org/W18-4423
Cite (ACL):
Sandip Modha, Prasenjit Majumder, and Thomas Mandl. 2018. Filtering Aggression from the Multilingual Social Media Feed. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), pages 199–207, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Filtering Aggression from the Multilingual Social Media Feed (Modha et al., TRAC 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W18-4423.pdf