Wise@DravidianLangTech 2026: Dialect-Aware Tamil Speech Classification and Recognition via Cross-Pipeline Embedding Transfer

Ganesh Sundhar S, Hari Krishnan N, Gnanasabesan G, Suriya KP, Jyothish Lal G


Abstract
This paper presents the **Wise** system for the shared task on dialect-based speech processing in Tamil, addressing two subtasks: **(1) four-way dialect region classification** (Northern, Southern, Western, Central) and **(2) dialectal Tamil ASR**. All audio is preprocessed with loudness normalization followed by neural denoising to ensure consistent audio quality for downstream models. For classification, we experiment with model variants that combine multilingual and Tamil-pretrained **Wav2Vec2** backbones with five temporal pooling strategies under frozen and partial fine-tuning settings. Our best configuration, learned attentive pooling with partial fine-tuning and a differentially trained MLP head, achieves a macro F1 of **0.79**, securing **1st place** with a margin of **0.26** points. For ASR, we propose two novel **dialect-conditioned Whisper** architectures, residual injection and cross-attention, that inject dialect embeddings from the trained classifier into the ASR pipeline. In addition, we evaluate a fine-tuned vanilla Whisper-Tamil baseline. The best ASR model achieves a **WER of 0.90**, securing **8th place** in the shared task.
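The abstract names learned attentive pooling as the best temporal pooling strategy but does not give its formulation. As an illustration only, the sketch below shows one common form of attentive pooling: each frame embedding is scored against a learned vector, the scores are normalized with a softmax over time, and the frames are averaged with those weights. The function name `attentive_pool`, the scoring vector `w`, and the single-vector scoring scheme are assumptions made for this example, not details taken from the paper.

```python
import math

def attentive_pool(frames, w):
    """Pool T frame embeddings (each of dimension D) into one D-dim vector.

    frames: list of T lists of D floats (e.g. encoder outputs per time step)
    w:      hypothetical learned scoring vector of length D (an assumption)
    Returns (pooled_vector, attention_weights).
    """
    # One scalar relevance score per frame: s_t = w . h_t
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in frames]

    # Numerically stable softmax over the time axis
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    # Attention-weighted sum of the frame embeddings
    dim = len(frames[0])
    pooled = [sum(a * h[d] for a, h in zip(alphas, frames)) for d in range(dim)]
    return pooled, alphas

# With an all-zero scoring vector every frame gets equal weight,
# so the result reduces to plain mean pooling.
frames = [[1.0, 2.0], [3.0, 4.0]]
pooled, alphas = attentive_pool(frames, [0.0, 0.0])
```

In a trained model the scoring vector would be learned jointly with the classifier head, so frames that carry more dialect-relevant information can receive larger weights than under mean pooling.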
Anthology ID:
2026.dravidianlangtech-1.71
Volume:
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
July
Year:
2026
Address:
Underline (Virtual)
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Saranya Rajiakodi, Subalalitha Navaneethakrishnan, Dhivya Chinnappa, Balasubramanian Palani, Malliga Subramanian, Kogilavani Shanmugavadivel, Ratnavel Rajalakshmi
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
447–452
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.71/
Cite (ACL):
Ganesh Sundhar S, Hari Krishnan N, Gnanasabesan G, Suriya KP, and Jyothish Lal G. 2026. Wise@DravidianLangTech 2026: Dialect-Aware Tamil Speech Classification and Recognition via Cross-Pipeline Embedding Transfer. In Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 447–452, Underline (Virtual). Association for Computational Linguistics.
Cite (Informal):
Wise@DravidianLangTech 2026: Dialect-Aware Tamil Speech Classification and Recognition via Cross-Pipeline Embedding Transfer (S et al., DravidianLangTech 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.dravidianlangtech-1.71.pdf