Abstract
This paper describes the submissions of our team, HAD-Tübingen, to SemEval 2019 Task 6: “OffensEval: Identifying and Categorizing Offensive Language in Social Media”. We participated in all three sub-tasks: sub-task A, “Offensive language identification”; sub-task B, “Automatic categorization of offense types”; and sub-task C, “Offense target identification”. As a baseline model, we used a long short-term memory (LSTM) recurrent neural network to identify and categorize offensive tweets. For all three sub-tasks, we experimented with external databases in a postprocessing step to improve the results produced by our model. The best macro-averaged F1 scores obtained for sub-tasks A, B, and C are 0.73, 0.52, and 0.37, respectively.
- Anthology ID:
- S19-2111
- Volume:
- Proceedings of the 13th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota, USA
- Editors:
- Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 622–627
- URL:
- https://aclanthology.org/S19-2111
- DOI:
- 10.18653/v1/S19-2111
- Cite (ACL):
- Himanshu Bansal, Daniel Nagel, and Anita Soloveva. 2019. HAD-Tübingen at SemEval-2019 Task 6: Deep Learning Analysis of Offensive Language on Twitter: Identification and Categorization. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 622–627, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
- Cite (Informal):
- HAD-Tübingen at SemEval-2019 Task 6: Deep Learning Analysis of Offensive Language on Twitter: Identification and Categorization (Bansal et al., SemEval 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/S19-2111.pdf