AbuseDetect_Alchemists@DravidianLangTech 2026: A Weighted Transformer Ensemble for Detecting Abusive Tamil Text Targeting Women

Meclin A Francis, Jyoti Kumari, Vinay Babu Ulli, Malavika Sreekumar, Joel Johnson


Abstract
This paper describes our system submitted to the shared task on Abusive Tamil Text Targeting Women on Social Media at DravidianLangTech@ACL 2026. We formulate the problem as a supervised binary classification task, assigning each Tamil social media comment to either the Abusive or the Non-Abusive category. Our pipeline begins with a tailored preprocessing stage that handles emoji translation, URL removal, and entity normalization. We then independently fine-tune two pre-trained transformer models, MuRIL and XLM-RoBERTa, on the task data. At inference time, we combine the two models through a weighted softmax ensemble, assigning a weight of 0.6 to MuRIL and 0.4 to XLM-RoBERTa. The resulting system achieves a Macro-F1 score of 0.8115 on the test set, outperforming both individual models. The code is publicly available at: https://github.com/meclin2345/AbuseDetect_Alchemists
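The weighted softmax ensemble described in the abstract can be illustrated with a minimal sketch. It assumes that "weighted softmax ensemble" means a weighted average of each model's softmax probabilities, which the abstract does not spell out. The function names, the label order, and the example logits are hypothetical and are not taken from the authors' repository; only the 0.6 and 0.4 weights come from the abstract.

```python
import math

# Label order is an assumption made for this illustration.
LABELS = ["Non-Abusive", "Abusive"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_ensemble(muril_logits, xlmr_logits, w_muril=0.6, w_xlmr=0.4):
    """Weighted average of the two models' softmax probabilities.

    The default weights (0.6 for MuRIL, 0.4 for XLM-RoBERTa) are the ones
    stated in the abstract; averaging probabilities is an assumption.
    """
    p_muril = softmax(muril_logits)
    p_xlmr = softmax(xlmr_logits)
    return [w_muril * a + w_xlmr * b for a, b in zip(p_muril, p_xlmr)]


def predict(muril_logits, xlmr_logits):
    """Return the label with the highest ensemble probability."""
    probs = weighted_ensemble(muril_logits, xlmr_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


if __name__ == "__main__":
    # Hypothetical logits for one comment from each fine-tuned model.
    label, probs = predict([2.0, 0.0], [0.0, 1.0])
    print(label, [round(p, 4) for p in probs])
```

Because the weights sum to 1, the combined scores remain a valid probability distribution, and the higher MuRIL weight means that model decides the label when the two models disagree with similar confidence.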
Anthology ID:
2026.dravidianlangtech-1.16
Volume:
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
July
Year:
2026
Address:
Underline (Virtual)
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Saranya Rajiakodi, Subalalitha Navaneethakrishnan, Dhivya Chinnappa, Balasubramanian Palani, Malliga Subramanian, Kogilavani Shanmugavadivel, Ratnavel Rajalakshmi
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
144–147
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.16/
Cite (ACL):
Meclin A Francis, Jyoti Kumari, Vinay Babu Ulli, Malavika Sreekumar, and Joel Johnson. 2026. AbuseDetect_Alchemists@DravidianLangTech 2026: A Weighted Transformer Ensemble for Detecting Abusive Tamil Text Targeting Women. In Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 144–147, Underline (Virtual). Association for Computational Linguistics.
Cite (Informal):
AbuseDetect_Alchemists@DravidianLangTech 2026: A Weighted Transformer Ensemble for Detecting Abusive Tamil Text Targeting Women (Francis et al., DravidianLangTech 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.16.pdf