Manohar Sita Rama Madhurapantula

2026

Mano_sub@DravidianLangTech 2026: Article-Aware Batching and Discriminative Fine-Tuning of MuRIL for Telugu Prompt-Style Classification
Manohar Sita Rama Madhurapantula | Seshu Babu Pulagara
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

This paper presents Team Mano_sub’s submission to the Telugu Prompt-Style Recovery task at DravidianLangTech 2026, which classifies Telugu text into nine stylistic categories: Formal, Informal, Optimistic, Pessimistic, Humorous, Serious, Inspiring, Authoritative, and Persuasive. We identify a critical structural property of the dataset: each of 384 unique source articles appears approximately 7.8 times with different style labels. Standard random batching leads to poor within-batch diversity when same-article samples co-occur, causing majority-class collapse and keeping macro F1 stuck at 0.022 regardless of learning rate. We propose an article-aware batch sampler that enforces within-batch article diversity, combined with discriminative learning rates for full MuRIL fine-tuning. Complete five-fold cross-validation yields a mean macro F1 of 0.3834 (std = 0.0189) on the development set, with per-fold best scores ranging from 0.3488 to 0.4040. The best fold-1 model achieves macro F1 = 0.2765 on the official test set, a 5.6× improvement over our officially submitted result of F1 = 0.0491; the improved score would have ranked 2nd among all 13 participating teams. All nine style classes are correctly predicted by epoch 5. Our system is officially ranked 12th in the Prompt Recovery for LLM in Telugu shared task at DravidianLangTech@ACL 2026. Code: https://github.com/msrmanohar/ACL-PRLLM
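The abstract does not spell out how the article-aware batch sampler is built, so the following is only a minimal sketch of one way to enforce within-batch article diversity, not the authors' implementation. The function name `article_aware_batches`, its parameters, and the greedy "largest remaining pool first" strategy are all assumptions made for illustration. It groups sample indices by source article and draws at most one sample per article into each batch.

```python
import random
from collections import defaultdict


def article_aware_batches(article_ids, batch_size, seed=0):
    """Return batches of sample indices in which no article id repeats.

    Hypothetical helper: article_ids[i] is the source-article id of
    sample i. Trailing batches may be smaller than batch_size.
    """
    rng = random.Random(seed)

    # Group sample indices by the article they were derived from.
    pools = defaultdict(list)
    for idx, art in enumerate(article_ids):
        pools[art].append(idx)
    for pool in pools.values():
        rng.shuffle(pool)

    batches = []
    while pools:
        # Shuffle, then stable-sort by remaining pool size, so ties
        # between equally large articles are broken at random.
        arts = list(pools)
        rng.shuffle(arts)
        arts.sort(key=lambda a: len(pools[a]), reverse=True)

        # Take one sample from each of up to batch_size distinct articles.
        batch = []
        for art in arts[:batch_size]:
            batch.append(pools[art].pop())
            if not pools[art]:
                del pools[art]
        rng.shuffle(batch)
        batches.append(batch)

    rng.shuffle(batches)
    return batches


if __name__ == "__main__":
    # Toy data: 6 articles, each appearing 3 times with different labels.
    ids = [i // 3 for i in range(18)]
    for b in article_aware_batches(ids, batch_size=4, seed=1):
        print(b, [ids[i] for i in b])
```

Because each batch is a plain list of indices, a list like this could be handed to a PyTorch `DataLoader` through its `batch_sampler` argument; whether the submitted system does so is not stated in the abstract.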

Co-authors

Seshu Babu Pulagara 1
