SERENE@DravidianLangTech 2026: Multimodal Approaches for Depression Detection in Dravidian Speech: Acoustic, Spectrogram, and Transformer-Based Models

TT Pranesh, K.K.Thamizhmathi, S Vigneshwaran, Bharathi B


Abstract
This paper presents our submission to the Depression Detection in Dravidian Languages shared task at DravidianLangTech 2026. We investigate three complementary approaches for speech-based depression detection in Tamil and Malayalam: (i) acoustic feature engineering using MFCC and prosodic features with a Support Vector Machine (SVM) classifier, (ii) a convolutional neural network (CNN) trained on Mel-spectrogram representations, and (iii) a transformer-based model using Whisper-generated transcripts fine-tuned with XLM-RoBERTa. Experimental results show that acoustic feature-based SVM and spectrogram-based CNN models achieve the strongest performance on both Tamil and Malayalam datasets, while the transformer-based approach also produces competitive results. We further discuss limitations and future research directions.
Anthology ID:
2026.dravidianlangtech-1.55
Volume:
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
July
Year:
2026
Address:
Underline (Virtual)
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Saranya Rajiakodi, Subalalitha Navaneethakrishnan, Dhivya Chinnappa, Balasubramanian Palani, Malliga Subramanian, Kogilavani Shanmugavadivel, Ratnavel Rajalakshmi
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
354–358
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.55/
Cite (ACL):
TT Pranesh, K.K.Thamizhmathi, S Vigneshwaran, and Bharathi B. 2026. SERENE@DravidianLangTech 2026: Multimodal Approaches for Depression Detection in Dravidian Speech: Acoustic, Spectrogram, and Transformer-Based Models. In Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 354–358, Underline (Virtual). Association for Computational Linguistics.
Cite (Informal):
SERENE@DravidianLangTech 2026: Multimodal Approaches for Depression Detection in Dravidian Speech: Acoustic, Spectrogram, and Transformer-Based Models (Pranesh et al., DravidianLangTech 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.55.pdf