Xianjun Shen


Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool
Xiangru Tang | Xianjun Shen
Proceedings of the 19th Chinese National Conference on Computational Linguistics

A growing volume of online content contains offensive language such as threats, swear words, or outright insults. This is a disgrace for a progressive society, and it raises the question of how language resources and technologies can cope with the challenge. However, previous work analyzes the problem only as a whole and fails to detect particular types of offensive content at a finer granularity, mainly because annotated data are lacking. In this work, we present a densely annotated dataset COLA