ByteBuilders@DravidianLangTech 2026: Transformer-Based Weighted Ensemble for Political Multiclass Sentiment Analysis of Tamil X (Twitter) Comments

Mitharshana T V, Shanthi S, Lavana V, Kaviya Varma R


Abstract
This paper describes our submission to the DravidianLangTech 2026 shared task on Tamil political sentiment analysis. The task is to classify Tamil political comments by sentiment into seven categories: substantiated, sarcastic, opinionated, positive, negative, neutral, and none of the above. Classifying the sentiment of Tamil political comments is difficult, and the sentiment classes are not uniformly distributed. To address these issues, we built an ensemble of two transformer-based models, XLM-RoBERTa and IndicBERT, and used 10-fold cross-validation to improve the model's reliability and reduce overfitting. We used oversampling to address class imbalance and trained the model with Focal Loss so that it concentrates more on the challenging examples. Finally, we averaged the token embeddings to improve the sentence representation.
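The three components named in the abstract (Focal Loss, averaging of token embeddings, and a weighted ensemble of two models) can be illustrated with a minimal pure-Python sketch. The abstract does not give the focusing parameter, the ensemble weights, or the label order, so `gamma=2.0`, `weight_a=0.5`, the `LABELS` ordering, and all function names below are assumptions for illustration only, not the authors' implementation.

```python
import math

# Label names come from the abstract; the ordering is an assumption.
LABELS = ["substantiated", "sarcastic", "opinionated", "positive",
          "negative", "neutral", "none of the above"]


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    probs  -- predicted class probabilities (sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter (assumed value; gamma=0 gives cross-entropy)
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [t / max(count, 1) for t in total]


def weighted_ensemble(probs_a, probs_b, weight_a=0.5):
    """Convex combination of two models' class probabilities.

    weight_a is a hypothetical weight for the first model; the second
    model receives 1 - weight_a.
    """
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(probs_a, probs_b)]


def predict_label(probs_a, probs_b, weight_a=0.5):
    """Return the label with the highest ensembled probability."""
    combined = weighted_ensemble(probs_a, probs_b, weight_a)
    best = max(range(len(combined)), key=combined.__getitem__)
    return LABELS[best]
```

With `gamma=0` the loss reduces to ordinary cross-entropy; larger values shrink the contribution of examples the model already classifies confidently, which is the property the abstract relies on for hard examples.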
Anthology ID:
2026.dravidianlangtech-1.20
Volume:
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
July
Year:
2026
Address:
Underline (Virtual)
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Saranya Rajiakodi, Subalalitha Navaneethakrishnan, Dhivya Chinnappa, Balasubramanian Palani, Malliga Subramanian, Kogilavani Shanmugavadivel, Ratnavel Rajalakshmi
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
163–168
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.20/
Cite (ACL):
Mitharshana T V, Shanthi S, Lavana V, and Kaviya Varma R. 2026. ByteBuilders@DravidianLangTech 2026: Transformer-Based Weighted Ensemble for Political Multiclass Sentiment Analysis of Tamil X (Twitter) Comments. In Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 163–168, Underline (Virtual). Association for Computational Linguistics.
Cite (Informal):
ByteBuilders@DravidianLangTech 2026: Transformer-Based Weighted Ensemble for Political Multiclass Sentiment Analysis of Tamil X (Twitter) Comments (V et al., DravidianLangTech 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.20.pdf