Priyatharshan Balachandran


2025

pdf bib
Towards Effective Emotion Analysis in Low-Resource Tamil Texts
Priyatharshan Balachandran | Uthayasanker Thayasivam | Randil Pushpananda | Ruvan Weerasinghe
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Emotion analysis plays a significant role in understanding human behavior and communication, yet research in Tamil language remains limited. This study focuses on building an emotion classifier for Tamil texts using machine learning (ML) and deep learning (DL), along with creating an emotion-annotated Tamil corpus for Ekman’s basic emotions. Our dataset combines publicly available data with re-annotation and translations. Along with traditional ML models we investigated the use of Transfer Learning (TL) with state-of-the-art models, such as BERT and Electra based models. Experiments were conducted on unbalanced and balanced datasets using data augmentation techniques. The results indicate that MultinomialNaive Bayes (MNB) and Support Vector Machine (SVM) performed well with TF-IDF and BoW representations, while among Transfer Learning models, LaBSE achieved the highest accuracy (63% balanced, 69% unbalanced), followed by TamilBERT and IndicBERT.