MysticCIOL@DravidianLangTech 2025: A Hybrid Framework for Sentiment Analysis in Tamil and Tulu Using Fine-Tuned SBERT Embeddings and Custom MLP Architectures
Minhaz Chowdhury, Arnab Laskar, Taj Ahmad, Azmine Toushik Wasi
Abstract
Sentiment analysis is a crucial NLP task used to analyze opinions in various domains, including marketing, politics, and social media. While transformer-based models like BERT and SBERT have significantly improved sentiment classification, their effectiveness in low-resource languages remains limited. Tamil and Tulu, despite their widespread use, suffer from data scarcity, dialectal variations, and code-mixing challenges, making sentiment analysis difficult. Existing methods rely on traditional classifiers or word embeddings, which struggle to generalize in these settings. To address this, we propose a hybrid framework that integrates fine-tuned SBERT embeddings with a Multi-Layer Perceptron (MLP) classifier, enhancing contextual representation and classification robustness. Our framework achieves validation F1-scores of 0.4218 for Tamil and 0.3935 for Tulu and test F1-scores of 0.4299 in Tamil and 0.1546 on Tulu, demonstrating its effectiveness. This research provides a scalable solution for sentiment classification in low-resource languages, with future improvements planned through data augmentation and transfer learning. Our full experimental codebase is publicly available at: github.com/ciol-researchlab/NAACL25-Mystic-Tamil-Sentiment-Analysis.- Anthology ID:
- 2025.dravidianlangtech-1.28
- Volume:
- Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
- Month:
- May
- Year:
- 2025
- Address:
- Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
- Editors:
- Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
- Venues:
- DravidianLangTech | WS
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 167–172
- Language:
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.dravidianlangtech-1.28/
- DOI:
- Cite (ACL):
- Minhaz Chowdhury, Arnab Laskar, Taj Ahmad, and Azmine Toushik Wasi. 2025. MysticCIOL@DravidianLangTech 2025: A Hybrid Framework for Sentiment Analysis in Tamil and Tulu Using Fine-Tuned SBERT Embeddings and Custom MLP Architectures. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 167–172, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- MysticCIOL@DravidianLangTech 2025: A Hybrid Framework for Sentiment Analysis in Tamil and Tulu Using Fine-Tuned SBERT Embeddings and Custom MLP Architectures (Chowdhury et al., DravidianLangTech 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.dravidianlangtech-1.28.pdf