Jyothish Lal G

2026

Wise@DravidianLangTech 2026: Dialect-Aware Tamil Speech Classification and Recognition via Cross-Pipeline Embedding Transfer
Ganesh Sundhar S | Hari Krishnan N | Gnanasabesan G | Suriya KP | Jyothish Lal G
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

This paper presents the **Wise** system for the shared task on dialect-based speech processing in Tamil, addressing two subtasks: **(1) four-way dialect region classification** (Northern, Southern, Western, Central), and **(2) dialectal Tamil ASR**. All audio is preprocessed using loudness normalization followed by neural denoising to ensure consistent audio quality for downstream models. For classification, we experiment with different model variants combining multilingual and Tamil-pretrained **Wav2Vec2** backbones with five temporal pooling strategies under frozen and partial fine-tuning settings. Our best configuration, i.e., learned attentive pooling with partial fine-tuning and a differentially trained MLP head, achieves a macro F1 of **0.79**, securing **1st place** with a margin of **0.26** points. For ASR, we propose two novel **dialect-conditioned Whisper** architectures—residual injection and cross-attention—that inject dialect embeddings from the trained classifier into the ASR pipeline. In addition, we evaluate a vanilla Whisper-Tamil fine-tuned baseline. The best model achieved a **WER of 0.90**, securing **8th place** in the shared task.
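As a rough illustration of the learned attentive pooling mentioned in this abstract, the sketch below collapses a sequence of frame-level feature vectors into one utterance vector using softmax attention weights. It assumes a single linear scoring vector (`score_weights`, a hypothetical stand-in for learned parameters); the paper's actual pooling layer and classifier head may be structured differently.

```python
import math


def attentive_pool(frames, score_weights):
    """Pool T frame vectors (each of dimension D) into one D-dim vector.

    `score_weights` is a hypothetical learned vector of dimension D that
    assigns each frame a scalar relevance score (an assumption of this sketch).
    """
    # One scalar score per frame: dot product with the scoring vector.
    scores = [sum(w * x for w, x in zip(score_weights, frame)) for frame in frames]

    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Attention-weighted sum of the frames.
    dim = len(frames[0])
    return [sum(a * frame[d] for a, frame in zip(attention, frames)) for d in range(dim)]


if __name__ == "__main__":
    toy_frames = [[1.0, 2.0], [3.0, 4.0]]
    # With an all-zero scoring vector every frame gets equal weight,
    # so the result reduces to plain mean pooling.
    print(attentive_pool(toy_frames, [0.0, 0.0]))
```

With non-zero scoring weights, frames that score higher contribute more to the pooled vector, which is the property that distinguishes attentive pooling from mean pooling.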


SYNAPSE@DravidianLangTech 2026: Multi-Level Political Meme Classification for Tamil and Malayalam
Suriya KP | Durai Singh K | Gnanasabesan G | Ganesh Sundhar S | Hari Krishnan N | Jyothish Lal G
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Political memes in Tamil and Malayalam present unique multimodal challenges for automated understanding, combining visual context with code-mixed, culturally grounded text. We present SYNAPSE, our system for the DravidianLangTech@ACL 2026 shared task on multi-level political meme classification. The task requires hierarchical classification of memes along two levels: Level 1 identifies the political stance (Support/Praise vs. Troll/Oppose), and Level 2 identifies the target (individual person vs. party). Our approach fine-tunes the Qwen3-VL-2B-Instruct vision-language model using parameter-efficient LoRA adapters on task-specific multimodal data, with structured output prompting for hierarchical label prediction. We report results for both Tamil and Malayalam subtracks. For Malayalam, our system achieves a Level 1 F1 of 0.9200 and Level 2 F1 of 0.4256 (Avg-F1: 0.6728, Rank 5). For Tamil, our system achieves a Level 1 F1 of 0.7840 and Level 2 F1 of 0.4885 (Avg-F1: 0.6362, Rank 14).


This paper presents an overview of the Shared Task on Prompt Recovery for Large Language Models (LLMs) in Telugu, organized as part of DravidianLangTech @ ACL 2026. The task focuses on identifying the underlying communicative style of Telugu text excerpts, framed as a nine-class single-label classification problem covering Formal, Informal, Optimistic, Pessimistic, Humorous, Serious, Inspiring, Authoritative, and Persuasive tones. The dataset was constructed by collecting Telugu YouTube comments and generating style-modified variants using an LLM, resulting in 3,000 training instances, 300 validation samples, and 301 test samples. A total of 52 teams registered for the shared task, with 13 teams submitting valid system predictions. Systems explored diverse approaches, including transformer-based fine-tuning (IndicBERT, MuRIL, XLM-R), ensemble and stacking methods, pairwise modeling strategies, curriculum learning, and few-shot large language model prompting. Evaluation was conducted using Macro F1-score as the primary metric. The top-performing system achieved a Macro F1-score of 0.2987. Overall results indicate that Telugu prompt-style recovery remains a challenging problem, particularly due to stylistic overlap and high lexical similarity across classes.


Shared Task on Depression Detection from Malayalam and Tamil Speech Data
Jyothish Lal G | Premjith B | Bharathi Raja Chakravarthi | Saranya Rajiakodi | Thenmozhi Durairaj | Prasanna Kumar Kumaresan
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Depression is one of the most common mental health problems in the world. It affects a person’s emotions, thinking, energy levels, and daily life. Early detection of depression is very important to provide timely support and treatment. While many studies focus on identifying depression from text, speech also carries important emotional and psychological signals that are often not fully explored. This paper presents an overview of the shared task on Depression Detection in Dravidian Languages (DD-DL). The task focuses on identifying signs of depression from speech data in two low-resource Dravidian languages: Tamil and Malayalam. Participants were provided with curated training datasets and were asked to build systems to classify speech samples as Depressed or Non-Depressed. The shared task includes two subtasks: (1) Depression detection in Tamil and (2) Depression detection in Malayalam. Participants applied various machine learning and deep learning approaches to model the acoustic and linguistic characteristics of speech. All submissions were evaluated using the macro-F1 score, which ensures fair performance measurement across classes.
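To make the macro-F1 metric used across these shared tasks concrete, here is a minimal sketch that computes it from scratch as the unweighted mean of per-class F1 scores. The function name and the toy labels are illustrative assumptions, not taken from any task's official scorer.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never
        # appears in either the references or the predictions.
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy, made-up labels: a system that always predicts the majority class.
    gold = ["Depressed", "Non-Depressed", "Non-Depressed", "Non-Depressed"]
    pred = ["Non-Depressed"] * 4
    print(macro_f1(gold, pred))
```

In this toy case accuracy is 0.75, yet the macro-F1 is much lower because the missed minority class contributes an F1 of zero with the same weight as the majority class.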

2025


CrewX@LT-EDI-2025: Transformer-Based Tamil ASR Fine-Tuning with AVMD Denoising and GRU-VAD for Enhanced Transcription Accuracy
Ganesh Sundhar S | Hari Krishnan N | Arun Prasad T D | Shruthikaa V | Jyothish Lal G
Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion

This research presents an improved Tamil Automatic Speech Recognition (ASR) system designed to enhance accessibility for elderly and transgender populations by addressing unique language challenges. We address the challenges of Tamil ASR (limited high-quality curated datasets, unique phonetic characteristics, and word-merging tendencies) through a comprehensive pipeline. Our methodology integrates Adaptive Variational Mode Decomposition (AVMD) for selective noise reduction based on signal characteristics, Silero Voice Activity Detection (VAD) with GRU architecture to eliminate non-speech segments, and fine-tuning of OpenAI’s Whisper model optimized for Tamil transcription. The system employs beam search decoding during inference to further improve accuracy. Our approach achieved state-of-the-art performance with a Word Error Rate (WER) of 31.9, winning first place in the LT-EDI 2025 shared task.
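For readers unfamiliar with the WER metric reported here, the following sketch computes it as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It returns a fraction; multiplying by 100 gives a percentage. The function name and example strings are illustrative assumptions, and official scoring scripts may apply text normalization that this sketch omits.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.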


NSR_LT-EDI-2025 Automatic speech recognition in Tamil
Nishanth S | Shruthi Rengarajan | Burugu Rahul | Jyothish Lal G
Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion

Automatic Speech Recognition (ASR) technology can potentially make services more accessible to marginalized communities. However, older adults and transgender speakers are often at a considerable disadvantage in accessing valuable services due to low digital literacy and social biases. In Tamil-speaking regions, these barriers are further compounded by the inability of ASR models to handle their distinctive speech patterns, accents, and spontaneous speaking styles. To bridge this gap, the LT-EDI-2025 shared task is designed to develop robust ASR systems for Tamil speech from vulnerable populations. Using Whisper-based models, this task aims to improve recognition rates on speech data collected from older adults and transgender speakers in naturalistic settings such as banks, hospitals, and public offices. By addressing the linguistic heterogeneity and acoustic variability within this underrepresented population, the shared task aims to develop inclusive AI solutions that break communication barriers and empower vulnerable populations in Tamil Nadu.


Overview of the Shared Task on Multimodal Hate Speech Detection in Dravidian languages: DravidianLangTech@NAACL 2025
Jyothish Lal G | Premjith B | Bharathi Raja Chakravarthi | Saranya Rajiakodi | Bharathi B | Rajeswari Natarajan | Ratnavel Rajalakshmi
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Detecting hate speech on social media platforms is crucial because of its adverse impact on mental health, social harmony, and online safety. This paper presents an overview of the shared task on Multimodal Hate Speech Detection in Dravidian Languages, organized as part of DravidianLangTech@NAACL 2025. The task emphasizes detecting hate speech in social media content that combines speech and text. Here, we focus on three low-resource Dravidian languages: Malayalam, Tamil, and Telugu. Participants were required to classify hate speech in three sub-tasks, each corresponding to one of these languages. The dataset was curated by collecting speech and corresponding text from YouTube videos. Participants employed various machine learning and deep learning-based models, including transformer-based architectures and multimodal frameworks. The submissions were evaluated using the macro F1 score. Experimental results underline the potential of multimodal approaches in advancing hate speech detection for low-resource languages. Team SSNTrio achieved the highest F1 scores in Malayalam and Tamil, 0.7511 and 0.7332, respectively. Team lowes achieved the best F1 score, 0.3817, in the Telugu sub-task.

2023


This paper summarizes the shared task on multimodal abusive language detection and sentiment analysis in Dravidian languages as part of the third Workshop on Speech and Language Technologies for Dravidian Languages at RANLP 2023. This shared task provides a platform for researchers worldwide to submit their models on two crucial social media data analysis problems in Dravidian languages - abusive language detection and sentiment analysis. Abusive language detection identifies social media content with abusive information, whereas sentiment analysis refers to the problem of determining the sentiments expressed in a text. This task aims to build models for detecting abusive content and analyzing fine-grained sentiment from multimodal data in Tamil and Malayalam. The multimodal data consists of three modalities - video, audio and text. The datasets for both tasks were prepared by collecting videos from YouTube. Sixty teams participated in both tasks. However, only two teams submitted their results. The submissions were evaluated using macro F1-score.
