KEC_AI_VSS_run2@DravidianLangTech 2025: Abusive Tamil and Malayalam Text targeting Women on Social Media
Kogilavani Shanmugavadivel, Malliga Subramanian, Sathiyaseelan S, Suresh Babu K, Vasikaran S
Abstract
The increasing instances of abusive language against women on social media platforms have brought to the fore the need for effective content moderation systems, especially in low-resource languages like Tamil and Malayalam. This paper addresses the challenge of detecting gender-based abuse in YouTube comments using annotated datasets in these languages. Comments are classified into abusive and non-abusive categories. We applied the following machine learning algorithms, namely Random Forest, Support Vector Machine, K-Nearest Neighbor, Gradient Boosting and AdaBoost for classification. Micro F1 score of 0.95 was achieved by SVM for Tamil and 0.72 by Random Forest for Malayalam. Our system participated in the shared task on abusive comment detection, out of 160 teams achieving the rank of 13th for Malayalam and rank 34 for Tamil, and both indicate both the challenges and potential of our approach in low-resource language processing. Our findings have highlighted the significance of tailored approaches to language-specific abuse detection.- Anthology ID:
- 2025.dravidianlangtech-1.34
- Volume:
- Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2025
- Address:
- Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
- Editors:
- Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
- Venues:
- DravidianLangTech | WS
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 200–204
- Language:
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.dravidianlangtech-1.34/
- DOI:
- Cite (ACL):
- Kogilavani Shanmugavadivel, Malliga Subramanian, Sathiyaseelan S, Suresh Babu K, and Vasikaran S. 2025. KEC_AI_VSS_run2@DravidianLangTech 2025: Abusive Tamil and Malayalam Text targeting Women on Social Media. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 200–204, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- KEC_AI_VSS_run2@DravidianLangTech 2025: Abusive Tamil and Malayalam Text targeting Women on Social Media (Shanmugavadivel et al., DravidianLangTech 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.dravidianlangtech-1.34.pdf