BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model

Zhenghao Wu; Hao Zheng; Jianming Wang; Weifeng Su; Jefferson Fong

doi:10.18653/v1/S19-2099

Abstract

In this study we address the problem of identifying and categorizing offensive language in social media. Our group, BNU-HKBU UIC NLP Team 2, uses supervised classification along with multiple versions of the data generated by different pre-processing methods. We then use the state-of-the-art model Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al., 2018), to capture linguistic, syntactic and semantic features. BERT's bidirectional encoder representations can capture long-range dependencies between the parts of a sentence. Our results show 85.12% accuracy and an 80.57% F1 score in Subtask A (offensive language identification), 87.92% accuracy and a 50% F1 score in Subtask B (categorization of offense types), and 69.95% accuracy and a 50.47% F1 score in Subtask C (offense target identification). Analysis of the results shows that distinguishing between targeted and untargeted offensive language is not a simple task. More work is needed on the imbalanced data problem in Subtasks B and C. Some future work is also discussed.
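The gap between accuracy and F1 reported for Subtasks B and C is what class imbalance typically produces. The sketch below illustrates the effect under stated assumptions: it assumes the F1 score is macro-averaged over classes (the abstract does not say which averaging is used), and the labels and counts are invented toy data, not the paper's data or results. The helper names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, invented data: 88 "targeted" and 12 "untargeted" examples.
gold = ["targeted"] * 88 + ["untargeted"] * 12
# A degenerate classifier that always predicts the majority class.
pred = ["targeted"] * 100

print(f"accuracy = {accuracy(gold, pred):.4f}")
print(f"macro F1 = {macro_f1(gold, pred):.4f}")
```

On this toy data the majority-class predictor is right 88% of the time, yet its macro F1 stays below 0.5 because the minority class contributes an F1 of zero. This is one way a system can pair high accuracy with a much lower F1 score.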

Anthology ID: S19-2099
Volume: Proceedings of the 13th International Workshop on Semantic Evaluation
Month: June
Year: 2019
Address: Minneapolis, Minnesota, USA
Editors: Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 551–555
URL: https://aclanthology.org/S19-2099
DOI: 10.18653/v1/S19-2099
Cite (ACL): Zhenghao Wu, Hao Zheng, Jianming Wang, Weifeng Su, and Jefferson Fong. 2019. BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 551–555, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal): BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model (Wu et al., SemEval 2019)
PDF: https://preview.aclanthology.org/teach-a-man-to-fish/S19-2099.pdf
