Anbarasan T
2025
KEC-Elite-Analysts@DravidianLangTech 2025: Deciphering Emotions in Tamil-English and Code-Mixed Social Media Tweets
Malliga Subramanian
|
Aruna A
|
Anbarasan T
|
Amudhavan M
|
Jahaganapathi S
|
Kogilavani Shanmugavadivel
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Sentiment analysis in code-mixed languages, particularly Tamil-English, is a growing challenge in natural language processing (NLP) due to the prevalence of multilingual communities on social media. This paper explores various machine learning and transformer-based models, including Logistic Regression, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), BERT, and mBERT, for sentiment classification of Tamil-English code-mixed text. The models are evaluated on a shared task dataset provided by DravidianLangTech@NAACL 2025, with performance measured through accuracy, precision, recall, and F1-score. Our results demonstrate that transformer-based models, particularly mBERT, outperform traditional classifiers in identifying sentiment polarity. Future work aims to address the challenges posed by code-switching and class imbalance through advanced model architectures and data augmentation techniques.