Abstract
On social media, people sometimes express their opinions in strong language, resorting to abusive or toxic comments. There are instances of communal hatred, hate speech, toxicity, and bullying. In this age of social media, it is very important to find ways to keep these toxic comments in check, so as to preserve the mental peace of people on social media. While there are tools and models to detect and potentially filter this kind of content, developing such models for low-resource languages remains an open research problem. In this paper, the task of abusive comment identification in the Tamil language is treated as a multi-class classification problem. Different pre-processing and modelling approaches are discussed. The approaches are compared on the basis of weighted average accuracy.
- Anthology ID:
- 2022.dravidianlangtech-1.33
- Volume:
- Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- DravidianLangTech
- Publisher:
- Association for Computational Linguistics
- Pages:
- 214–220
- URL:
- https://aclanthology.org/2022.dravidianlangtech-1.33
- DOI:
- 10.18653/v1/2022.dravidianlangtech-1.33
- Cite (ACL):
- Aanisha Bhattacharyya. 2022. Aanisha@TamilNLP-ACL2022:Abusive Detection in Tamil. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 214–220, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Aanisha@TamilNLP-ACL2022:Abusive Detection in Tamil (Bhattacharyya, DravidianLangTech 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.dravidianlangtech-1.33.pdf