Abstract
A growing share of the data generated online contains offensive language such as threats, swear words, or outright insults. This is a disgrace for a progressive society, and it raises the question of how language resources and technologies can cope with the challenge. Previous work, however, analyzes the problem only as a whole and does not detect particular types of offensive content in a fine-grained way, mainly because annotated data is lacking. In this work, we present a densely annotated dataset COLA-
- Anthology ID: 2020.ccl-1.97
- Volume: Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month: October
- Year: 2020
- Address: Haikou, China
- Venue: CCL
- Publisher: Chinese Information Processing Society of China
- Pages: 1045–1056
- Language: English
- URL: https://aclanthology.org/2020.ccl-1.97
- Cite (ACL): Xiangru Tang and Xianjun Shen. 2020. Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 1045–1056, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal): Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool (Tang & Shen, CCL 2020)
- PDF: https://preview.aclanthology.org/remove-xml-comments/2020.ccl-1.97.pdf