CUET_Ignite@DravidianLangTech 2025: Detection of Abusive Comments in Tamil Text Using Transformer Models
MD.Mahadi Rahman, Mohammad Minhaj Uddin, Mohammad Shamsul Arefin
Abstract
Abusive comment detection in low-resource languages is a challenging task, particularly when addressing gender-based abuse. Identifying abusive language targeting women is crucial for effective content moderation and for fostering safer online spaces. A shared task on abusive comment detection in Tamil text, organized by DravidianLangTech@NAACL 2025, allowed us to address this challenge using a curated dataset. For this task, we experimented with various machine learning (ML) and deep learning (DL) models, including Logistic Regression, Random Forest, SVM, CNN, LSTM, and BiLSTM, as well as transformer-based models such as mBERT, IndicBERT, and XLM-RoBERTa, among others. The dataset comprised Tamil YouTube comments annotated with binary labels, Abusive and Non-Abusive, capturing explicit abuse, implicit biases, and stereotypes. Our experiments demonstrated that XLM-RoBERTa achieved the highest macro F1-score (0.80), highlighting its effectiveness in handling Tamil text. This research contributes to advancing abusive language detection and natural language processing in low-resource languages, particularly for addressing gender-based abuse online.
- Anthology ID:
- 2025.dravidianlangtech-1.70
- Volume:
- Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2025
- Address:
- Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
- Editors:
- Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
- Venues:
- DravidianLangTech | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 392–397
- URL:
- https://preview.aclanthology.org/landing_page/2025.dravidianlangtech-1.70/
- Cite (ACL):
- MD.Mahadi Rahman, Mohammad Minhaj Uddin, and Mohammad Shamsul Arefin. 2025. CUET_Ignite@DravidianLangTech 2025: Detection of Abusive Comments in Tamil Text Using Transformer Models. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 392–397, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- CUET_Ignite@DravidianLangTech 2025: Detection of Abusive Comments in Tamil Text Using Transformer Models (Rahman et al., DravidianLangTech 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.dravidianlangtech-1.70.pdf