KECEmpower@DravidianLangTech 2025: Abusive Tamil and Malayalam Text targeting Women on Social Media
Malliga Subramanian, Kogilavani Shanmugavadivel, Indhuja V S, Kowshik P, Jayasurya S
Abstract
The detection of abusive text targeting women, especially in Dravidian languages like Tamil and Malayalam, presents a unique challenge due to linguistic complexities and code-mixing on social media. This paper evaluates machine learning models such as Support Vector Machines (SVM), Logistic Regression (LR), and Random Forest Classifiers (RFC) for identifying abusive content. Code-mixed datasets sourced from platforms like YouTube are used to train and test the models. Performance is evaluated using accuracy, precision, recall, and F1-score metrics. Our findings show that SVM outperforms the other classifiers in accuracy and recall. However, challenges persist in detecting implicit abuse and addressing informal, culturally nuanced language. Future work will explore transformer-based models like BERT for better context understanding, along with data augmentation techniques to enhance model performance. Additionally, efforts will focus on expanding labeled datasets to improve abuse detection in these low-resource languages.- Anthology ID:
- 2025.dravidianlangtech-1.30
- Volume:
- Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2025
- Address:
- Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
- Editors:
- Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
- Venues:
- DravidianLangTech | WS
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 178–181
- Language:
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.dravidianlangtech-1.30/
- DOI:
- Cite (ACL):
- Malliga Subramanian, Kogilavani Shanmugavadivel, Indhuja V S, Kowshik P, and Jayasurya S. 2025. KECEmpower@DravidianLangTech 2025: Abusive Tamil and Malayalam Text targeting Women on Social Media. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 178–181, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- KECEmpower@DravidianLangTech 2025: Abusive Tamil and Malayalam Text targeting Women on Social Media (Subramanian et al., DravidianLangTech 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.dravidianlangtech-1.30.pdf