Abstract
With the growth of advanced technology, social media networks play a significant role in human lives. Censorship is the suppression of speech, public communication, or other information, and it plays a large role on social media. The censored content may be considered harmful, sensitive, or inconvenient. Authorities such as institutions, governments, and other organizations carry out censorship. This paper implements a model that classifies tweets as censored or uncensored, a binary classification task. The paper describes our submission to the Censorship shared task of the NLP4IF 2021 workshop. We tried various transformer-based pre-trained models, and XLNet gave the best accuracy among them. We fine-tuned the model for better performance, achieved a reasonable accuracy, and calculated other performance metrics.
- Anthology ID:
- 2021.nlp4if-1.21
- Volume:
- Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Editors:
- Anna Feldman, Giovanni Da San Martino, Chris Leberknight, Preslav Nakov
- Venue:
- NLP4IF
- Publisher:
- Association for Computational Linguistics
- Pages:
- 136–139
- URL:
- https://aclanthology.org/2021.nlp4if-1.21
- DOI:
- 10.18653/v1/2021.nlp4if-1.21
- Cite (ACL):
- Shaikh Sahil Ahmed and Anand Kumar M. 2021. Classification of Censored Tweets in Chinese Language using XLNet. In Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 136–139, Online. Association for Computational Linguistics.
- Cite (Informal):
- Classification of Censored Tweets in Chinese Language using XLNet (Ahmed & Kumar M., NLP4IF 2021)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2021.nlp4if-1.21.pdf