Ratnajit Dhar


2026

The rapid growth of social media has gone hand in hand with a sharp increase in heated public discussions, where debates about elections, conflicts, protests, and identity often turn into divisive and polarized rhetoric. In this paper, we present our system for SemEval 2026 Task 9 – Subtask 1: Multilingual Text Classification Challenge-Polarization Detection, focusing specifically on the Bengali language. The task is a binary classification problem aimed at determining whether a social media post exhibits attitude polarization, such as intolerance, dehumanization, deindividuation, vilification, or stereotyping toward others’ opinions, identities, or beliefs. Among 49 participating teams, our approach ranked 2nd, achieving a macro-F1 score of 0.8582. We experimented with both transformer-based models and large language models (LLMs), and observed that LoRA-based instruction fine-tuned LLM-based approaches delivered the strongest performance in detecting nuanced and context-dependent polarization in Bengali text.
Tamil has a lot of internal variability, including the way it is used in casual conversations, code mixing, and phonetic differences in the way it is spoken in different regions, making it quite difficult to transcribe the spoken word and classify the dialects. In order to address these challenges, our paper presents the system developed by the CUET_InferX team for the Shared Task on Dialect Based Speech Recognition and Classification in Tamil, which was part of DravidianLangTech@ACL 2026. For Subtask 2 (ASR), our proposed system is based on a dual-architecture design that incorporates a fine-tuned Whisper-large-v3 model with Low-Rank Adaptation (LoRA) and a Wav2Vec2 XLSR-53 model, topped with a KenLM statistical language model for n-gram phonetic correction. Our ASR system resulted in a Word Error Rate (WER) of 0.54, which earned us 2nd position for Subtask 2. For Subtask 1 (Speech-Based Dialect Classification), our proposed system is based on a text-based weighted ensemble of IndicBERT, MuRIL, XLM-RoBERTa, and TamilBERT models, which is completely dependent on our ASR system’s transcription outputs. Our proposed system achieved a Macro F1 score of 0.22, which earned us 9th position for Subtask 1.

2025

Misogynous content on social media, especially in memes, present challenges due to the complex reciprocation of text and images that carry offensive messages. This difficulty mostly arises from the lack of direct alignment between modalities and biases in large-scale visio-linguistic models. In this paper, we present our system for the Shared Task on Misogyny Meme Detection - DravidianLangTech@NAACL 2025. We have implemented various unimodal models, such as mBERT and IndicBERT for text data, and ViT, ResNet, and EfficientNet for image data. Moreover, we have tried combining these models and finally adopted a multimodal approach that combined mBERT for text and EfficientNet for image features, both fine-tuned to better interpret subtle language and detailed visuals. The fused features are processed through a dense neural network for classification. Our approach achieved an F1 score of 0.78120, securing 4th place and demonstrating the potential of transformer-based architectures and state-of-the-art CNNs for this task.