Abstract
This paper describes our participation in the First Shared Task on Aggression Identification. The proposed method relies on machine learning to identify social media texts that contain aggression. The main features employed by our method are information extracted from word embeddings and the output of a sentiment analyser. Several machine learning methods and different combinations of features were tried. The official submissions used Support Vector Machines (SVMs) and Random Forests. The official evaluation showed that Random Forests work best for texts similar to those in the training dataset, whilst SVMs are a better choice for texts that differ from it. The evaluation also showed that, despite its simplicity, the method performs well when compared with more elaborate methods.
- Anthology ID: W18-4414
- Volume: Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018)
- Month: August
- Year: 2018
- Address: Santa Fe, New Mexico, USA
- Editors: Ritesh Kumar, Atul Kr. Ojha, Marcos Zampieri, Shervin Malmasi
- Venue: TRAC
- Publisher: Association for Computational Linguistics
- Pages: 113–119
- URL: https://aclanthology.org/W18-4414
- Cite (ACL): Constantin Orăsan. 2018. Aggressive Language Identification Using Word Embeddings and Sentiment Features. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), pages 113–119, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal): Aggressive Language Identification Using Word Embeddings and Sentiment Features (Orăsan, TRAC 2018)
- PDF: https://preview.aclanthology.org/ml4al-ingestion/W18-4414.pdf
- Code: dinel/aggression_identification
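
The abstract names two feature sources, word embeddings and sentiment analyser output, without giving details. As a rough illustration only, and not the code from the repository listed above, the sketch below shows one common way such features could be combined into a single vector per text. Everything in it is an assumption: the toy embedding table, the toy polarity lexicon standing in for a sentiment analyser, the choice of averaging word vectors, and the function names.

```python
from statistics import fmean

# Hypothetical toy embeddings; a real system would load pretrained vectors.
TOY_EMBEDDINGS = {
    "you": [0.1, 0.3, -0.2],
    "are": [0.0, 0.1, 0.1],
    "awful": [-0.7, 0.5, 0.4],
    "great": [0.6, -0.4, 0.2],
}

# Hypothetical polarity lexicon standing in for a sentiment analyser.
TOY_POLARITY = {"awful": -1.0, "great": 1.0}


def average_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [fmean(column) for column in zip(*vectors)]


def sentiment_scores(tokens, polarity):
    """Return [positive, negative]: the share of tokens with each polarity."""
    if not tokens:
        return [0.0, 0.0]
    pos = sum(1 for t in tokens if polarity.get(t, 0.0) > 0)
    neg = sum(1 for t in tokens if polarity.get(t, 0.0) < 0)
    return [pos / len(tokens), neg / len(tokens)]


def featurise(text, embeddings=TOY_EMBEDDINGS, polarity=TOY_POLARITY, dim=3):
    """Concatenate the averaged embedding with the two sentiment scores."""
    tokens = text.lower().split()
    return average_embedding(tokens, embeddings, dim) + sentiment_scores(tokens, polarity)


if __name__ == "__main__":
    print(featurise("You are awful"))
```

Vectors built this way could then be passed to an off-the-shelf classifier of the kinds the abstract mentions, for example scikit-learn's `SVC` or `RandomForestClassifier`; the features and settings actually used in the paper may differ.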