Nandhini B
2026
Lannisters@DravidianLangTech 2026: A Comparative and Ablation Study of Multilingual Transformers for Gender-Targeted Abuse Detection in Tamil Social Media Platforms
Kalaivani K S | Jaisanth K | Nandhini B
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
The prevalence of Tamil on social media has heightened the need to address online harassment of women, and with it the need for systems that automatically identify abusive Tamil content to promote safe online communication. This paper presents a binary classification approach that labels content as Abusive or Non-Abusive. We experimented with several multilingual transformer models, including DistilBERT, mBERT, and XLM-RoBERTa. XLM-RoBERTa performed best, achieving an accuracy of 91.17% and a macro F1 score of 0.8865. Ablation experiments show that structured preprocessing, balancing the minority class, and hyperparameter tuning contribute to the model's performance.
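The abstract reports both accuracy and macro F1, and the two can diverge when one class is rare. The sketch below illustrates the standard definitions on a small made-up label set. The function names and the toy labels are assumptions for illustration only; they are not the authors' evaluation code or data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: 0 = Non-Abusive (majority), 1 = Abusive (minority).
toy_true = [0] * 8 + [1] * 2
toy_pred = [0] * 10  # a degenerate classifier that always predicts the majority

toy_accuracy = accuracy(toy_true, toy_pred)
toy_macro_f1 = macro_f1(toy_true, toy_pred)
print(f"accuracy={toy_accuracy:.4f} macro_f1={toy_macro_f1:.4f}")
```

On this toy input the always-majority classifier still scores 0.8 accuracy, while macro F1 drops to about 0.44 because the minority class contributes an F1 of zero. This is why macro F1 is a useful companion to accuracy for imbalanced abuse-detection data.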