Mohan Raj M A
2025
codecrackers@DravidianLangTech 2025: Sentiment Classification in Tamil and Tulu Code-Mixed Social Media Text Using Machine Learning
Lalith Kishore V P | Dr G Manikandan | Mohan Raj M A | Keerthi Vasan A | Aravindh M
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Sentiment analysis of code-mixed Dravidian languages has become increasingly important as the volume of multilingual and code-mixed content on social media grows. This paper presents our approach to the “Seventh Shared Task on Sentiment Analysis in Code-mixed Tamil and Tulu”, held as part of DravidianLangTech (NAACL-2025). Despite this growth, sentiment analysis for code-mixed Dravidian languages has received little attention, owing to challenges such as class imbalance, small sample sizes, and the informal nature of code-mixed text. This study applies an SVM-based approach to sentiment classification for both Tamil and Tulu. The SVM model achieved competitive macro-average F1 scores of 0.54 for Tulu and 0.438 for Tamil, suggesting that traditional machine learning methods can be effective for sentiment categorization in code-mixed languages under low-resource settings.
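Because the abstract reports macro-average F1 and names class imbalance as a challenge, a small worked example may help show how the two relate. The sketch below is illustrative only: the function name `macro_f1`, the label names, and the toy predictions are assumptions made for this example and are not taken from the shared task data or from the authors' system.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical imbalanced toy set: 8 majority-class items, 2 minority-class items.
y_true = ["Positive"] * 8 + ["Negative"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["Positive"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case the always-majority classifier looks strong on accuracy but is penalized by macro-average F1, since the minority class contributes an F1 of zero with equal weight. This is the usual reason macro-averaging is preferred when classes are imbalanced.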