Abstract
Task 12 of SemEval 2020 consisted of 3 subtasks, namely offensive language identification (Subtask A), categorization of offense type (Subtask B), and offense target identification (Subtask C). This paper presents the results our classifiers obtained for the English language in the 3 subtasks. We used BERT and BiLSTM classifiers. On the test set, our BERT classifier obtained a macro F1 score of 0.90707 for Subtask A and 0.65279 for Subtask B. The BiLSTM classifier obtained a macro F1 score of 0.57565 for Subtask C. The paper also analyzes the errors made by our classifiers. We conjecture that the presence of a few misleading instances in the dataset affects the performance of the classifiers. Our analysis also discusses the need for temporal context and world knowledge to determine the offensiveness of some comments.
- Anthology ID:
- 2020.semeval-1.204
- Volume:
- Proceedings of the Fourteenth Workshop on Semantic Evaluation
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona (online)
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- International Committee for Computational Linguistics
- Pages:
- 1562–1568
- URL:
- https://aclanthology.org/2020.semeval-1.204
- DOI:
- 10.18653/v1/2020.semeval-1.204
- Cite (ACL):
- Arup Baruah, Kaushik Das, Ferdous Barbhuiya, and Kuntal Dey. 2020. IIITG-ADBU at SemEval-2020 Task 12: Comparison of BERT and BiLSTM in Detecting Offensive Language. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1562–1568, Barcelona (online). International Committee for Computational Linguistics.
- Cite (Informal):
- IIITG-ADBU at SemEval-2020 Task 12: Comparison of BERT and BiLSTM in Detecting Offensive Language (Baruah et al., SemEval 2020)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2020.semeval-1.204.pdf
- Data
- Hate Speech and Offensive Language
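The macro F1 metric reported in the abstract is the unweighted mean of per-class F1 scores, so minority classes count as much as the majority class. A minimal sketch of its computation, using hypothetical labels for Subtask C's three target classes (the label names and data here are illustrative, not taken from the shared task's files):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1s.append(f1)
    return sum(f1s) / len(f1s)

# Hypothetical gold and predicted labels (individual, group, other targets)
y_true = ["IND", "IND", "GRP", "OTH", "GRP", "IND"]
y_pred = ["IND", "GRP", "GRP", "OTH", "GRP", "IND"]
print(round(macro_f1(y_true, y_pred), 4))  # → 0.8667
```

This matches `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` and is the averaging used for all three subtask scores above.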