S.b.priya

2026

RspectNLP@LT-EDI 2026: Rubric-Driven Prompting for Safe Multilingual Counter Narrative Generation
S.b.priya | Bharathi B
Proceedings of the Sixth Workshop on Language Technology for Equality, Diversity, Inclusion

Harmful online discourse targeting the LGBTQ+ community remains a concern on social media platforms. Although hate speech detection is well explored, constructive counter-narrative generation is still an emerging field of research, especially in multilingual and low-resource settings. Counter-narratives respond to harmful discourse with respectful and empathetic replies rather than relying on content deletion alone. This paper proposes a zero-shot multilingual system for counter-narrative generation in English and Tamil. The system uses the pretrained google/flan-t5-base transformer model, guided by rubric-aligned prompts that encourage polite, contextually relevant, and non-toxic responses. It operates in a zero-shot setting without task-specific fine-tuning and uses beam search decoding for controlled response generation. On the English test data, the system achieved an overall score of 70.33 per cent with a contextual coherence score of 81.82 per cent. On the Tamil test data, it achieved an overall score of 33.57 per cent, with significantly lower scores on coherence and quality. These findings indicate that structured prompting can facilitate safe and coherent generation in English, but they also underscore the challenges that zero-shot multilingual models face in low-resource language scenarios.
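The rubric-aligned prompting with beam search described in the abstract could look roughly like the sketch below. The rubric wording, the prompt template, the helper names `build_prompt` and `generate_counter_narrative`, and the decoding settings (`num_beams=4`, `max_new_tokens=64`) are all illustrative assumptions; the abstract does not state the authors' actual prompts or parameters. Only the model name google/flan-t5-base comes from the abstract.

```python
# Hypothetical rubric; the paper's actual rubric text is not given in the abstract.
RUBRIC = (
    "Be polite and respectful.",
    "Respond directly to the content of the message.",
    "Do not use toxic or insulting language.",
)


def build_prompt(message: str, language: str = "English") -> str:
    """Assemble a zero-shot instruction prompt from the rubric (assumed template)."""
    rules = "\n".join(f"- {rule}" for rule in RUBRIC)
    return (
        f"Write a counter-narrative in {language} to the message below.\n"
        f"Follow these rules:\n{rules}\n"
        f"Message: {message}\n"
        f"Counter-narrative:"
    )


def generate_counter_narrative(message: str, language: str = "English",
                               num_beams: int = 4, max_new_tokens: int = 64) -> str:
    """Run the pretrained model with beam search; no fine-tuning is involved."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
    inputs = tokenizer(build_prompt(message, language),
                       return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, num_beams=num_beams,
                                max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_counter_narrative(text, language="Tamil")` changes only the instruction text, which is one plausible reading of how a single zero-shot model could serve both languages.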


TamilVoiceLab@DravidianLangTech 2026: Investigating Whisper Tamil Large-v2 for Dialectal Tamil Speech Recognition
S.b.priya | Bharathi B
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Automatic Speech Recognition (ASR) for languages rich in dialects and those with limited resources presents significant challenges due to the variations in pronunciation and vocabulary across different regions. This study offers a baseline evaluation of the Whisper Tamil Large-v2 model without fine-tuning for the Tamil Dialect Speech Recognition shared task. The focus is on the ASR subtask, utilizing dialectal Tamil speech recordings gathered from various regional dialects within Tamil Nadu. The pretrained Whisper Tamil Large-v2 model was assessed directly, without any supplementary fine-tuning or domain adaptation. A total of 579 dialect speech samples were used for experimentation, with performance evaluated based on Word Error Rate (WER). The model recorded a WER of 0.71, indicating that even robust multilingual pretrained models encounter challenges in dialect-rich and low-resource environments. These findings underscore the necessity for dialect-aware adaptation and the importance of balanced dialect training data to develop effective Tamil ASR systems.
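For readers unfamiliar with the metric, Word Error Rate is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The function below is a minimal sketch of that standard definition; the name `word_error_rate` and the whitespace tokenization are assumptions, and the shared task's own scoring script may normalize text or aggregate across utterances differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 0.71 means the number of word-level errors equals about 71 per cent of the reference word count.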

Co-authors

Bharathi B 2
