Classification of Censored Tweets in Chinese Language using XLNet

Shaikh Sahil Ahmed; Anand Kumar M.

doi:10.18653/v1/2021.nlp4if-1.21

Abstract

With the growth of modern technology, social media networks play a significant role in human lives. Censorship is the suppression of speech, public communication, or other information, and it plays a large role on social media. The suppressed content may be considered harmful, sensitive, or inconvenient. Authorities such as governments, institutions, and other organizations carry out censorship. This paper presents a model that classifies tweets as censored or uncensored, a binary classification task. It describes our submission to the Censorship shared task of the NLP4IF 2021 workshop. We tried several transformer-based pre-trained models, of which XLNet gave the best accuracy. We fine-tuned the model to improve performance, achieved reasonable accuracy, and also report other performance metrics.
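The abstract mentions accuracy and "other performance metrics" without naming them. As an illustration only, the sketch below computes the metrics commonly reported for binary classification (accuracy, precision, recall, F1), assuming label 1 means censored and 0 means uncensored. The function name `binary_metrics`, the label encoding, and the toy labels are assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for 0/1 labels.

    Assumption for this sketch: 1 = censored (positive class), 0 = uncensored.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up toy labels, not data from the shared task.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice the predictions would come from the fine-tuned classifier evaluated on a held-out split; the toy lists here only show how the four numbers relate to the confusion-matrix counts.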

Anthology ID: 2021.nlp4if-1.21
Volume: Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
Month: June
Year: 2021
Address: Online
Editors: Anna Feldman, Giovanni Da San Martino, Chris Leberknight, Preslav Nakov
Venue: NLP4IF
Publisher: Association for Computational Linguistics
Pages: 136–139
URL: https://aclanthology.org/2021.nlp4if-1.21
DOI: 10.18653/v1/2021.nlp4if-1.21
Cite (ACL): Shaikh Sahil Ahmed and Anand Kumar M. 2021. Classification of Censored Tweets in Chinese Language using XLNet. In Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 136–139, Online. Association for Computational Linguistics.
Cite (Informal): Classification of Censored Tweets in Chinese Language using XLNet (Ahmed & Kumar M., NLP4IF 2021)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/2021.nlp4if-1.21.pdf