Towards Effective Emotion Analysis in Low-Resource Tamil Texts

Priyatharshan Balachandran, Uthayasanker Thayasivam, Randil Pushpananda, Ruvan Weerasinghe


Abstract
Emotion analysis plays a significant role in understanding human behavior and communication, yet research in Tamil language remains limited. This study focuses on building an emotion classifier for Tamil texts using machine learning (ML) and deep learning (DL), along with creating an emotion-annotated Tamil corpus for Ekman’s basic emotions. Our dataset combines publicly available data with re-annotation and translations. Along with traditional ML models we investigated the use of Transfer Learning (TL) with state-of-the-art models, such as BERT and Electra based models. Experiments were conducted on unbalanced and balanced datasets using data augmentation techniques. The results indicate that MultinomialNaive Bayes (MNB) and Support Vector Machine (SVM) performed well with TF-IDF and BoW representations, while among Transfer Learning models, LaBSE achieved the highest accuracy (63% balanced, 69% unbalanced), followed by TamilBERT and IndicBERT.
Anthology ID:
2025.dravidianlangtech-1.101
Volume:
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
May
Year:
2025
Address:
Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
Venues:
DravidianLangTech | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
585–598
Language:
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.dravidianlangtech-1.101/
DOI:
Bibkey:
Cite (ACL):
Priyatharshan Balachandran, Uthayasanker Thayasivam, Randil Pushpananda, and Ruvan Weerasinghe. 2025. Towards Effective Emotion Analysis in Low-Resource Tamil Texts. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 585–598, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Towards Effective Emotion Analysis in Low-Resource Tamil Texts (Balachandran et al., DravidianLangTech 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.dravidianlangtech-1.101.pdf