Momtazul Arefin Labib


2025

CUET-823@DravidianLangTech 2025: Shared Task on Multimodal Misogyny Meme Detection in Tamil Language
Arpita Mallik | Ratnajit Dhar | Udoy Das | Momtazul Arefin Labib | Samia Rahman | Hasan Murad
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Misogynous content on social media, especially in memes, presents challenges due to the complex interplay of text and images that carry offensive messages. This difficulty arises mostly from the lack of direct alignment between modalities and from biases in large-scale visio-linguistic models. In this paper, we present our system for the Shared Task on Misogyny Meme Detection - DravidianLangTech@NAACL 2025. We implemented various unimodal models, such as mBERT and IndicBERT for text data, and ViT, ResNet, and EfficientNet for image data. We then experimented with combinations of these models and finally adopted a multimodal approach that combined mBERT for text and EfficientNet for image features, both fine-tuned to better interpret subtle language and detailed visuals. The fused features are processed through a dense neural network for classification. Our approach achieved an F1 score of 0.78120, securing 4th place and demonstrating the potential of transformer-based architectures and state-of-the-art CNNs for this task.
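
The fusion architecture described in this abstract can be sketched in a few lines of PyTorch. The sketch below is illustrative only, assuming mBERT's 768-dimensional [CLS] embedding, EfficientNet-B0's 1280-dimensional pooled features, and a hidden size of 512 for the dense classifier; the paper's exact EfficientNet variant and head configuration are not specified here.

```python
import torch
import torch.nn as nn
from transformers import AutoModel
from torchvision.models import efficientnet_b0

class MisogynyMemeClassifier(nn.Module):
    """Late-fusion sketch: mBERT text features + EfficientNet image features."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.text_encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
        cnn = efficientnet_b0(weights="IMAGENET1K_V1")  # B0 variant is an assumption
        self.image_encoder = nn.Sequential(cnn.features, cnn.avgpool, nn.Flatten())
        # Dense network over the concatenated text (768) + image (1280) features.
        self.classifier = nn.Sequential(
            nn.Linear(768 + 1280, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes),
        )

    def forward(self, input_ids, attention_mask, pixel_values):
        # [CLS] token embedding as the sentence-level text representation.
        text_feat = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state[:, 0]
        image_feat = self.image_encoder(pixel_values)
        return self.classifier(torch.cat([text_feat, image_feat], dim=-1))
```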

Team ML_Forge@DravidianLangTech 2025: Multimodal Hate Speech Detection in Dravidian Languages
Adnan Faisal | Shiti Chowdhury | Sajib Bhattacharjee | Udoy Das | Samia Rahman | Momtazul Arefin Labib | Hasan Murad
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Ensuring a safe and inclusive online environment requires effective hate speech detection on social media. While detection systems have significantly advanced for English, many regional languages, including Malayalam, Tamil, and Telugu, remain underrepresented, creating challenges in identifying harmful content accurately. These languages present unique challenges due to their complex grammar, diverse dialects, and frequent code-mixing with English. The rise of multimodal content, including text and audio, adds further complexity to detection tasks. The shared task “Multimodal Hate Speech Detection in Dravidian Languages: DravidianLangTech@NAACL 2025” aimed to address these challenges. A YouTube-sourced dataset was provided, labeled into five categories: Gender (G), Political (P), Religious (R), Personal Defamation (C), and Non-Hate (NH). In our approach, we used mBERT and T5 for text, and Wav2Vec2 and Whisper for audio. T5 performed poorly compared to mBERT, which achieved the highest F1 scores on the test dataset. For audio, Wav2Vec2 was chosen over Whisper because it processes raw audio effectively using self-supervised learning. In the hate speech detection task, we achieved a macro F1 score of 0.2005 for Malayalam, ranking 15th, 0.1356 for Tamil and 0.1465 for Telugu, with both ranking 16th.
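
As a rough illustration of the text-plus-audio setup, the sketch below pairs mBERT with Wav2Vec2 and maps the pooled features to the five labels. The mean-pooling of audio frames, the single linear head, and the facebook/wav2vec2-base checkpoint are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, Wav2Vec2Model

class MultimodalHateSpeechClassifier(nn.Module):
    """Sketch: mBERT for text, Wav2Vec2 for raw audio, linear fusion head."""

    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.text_encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
        self.audio_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        # 768 (mBERT [CLS]) + 768 (mean-pooled Wav2Vec2 frames) -> 5 classes
        self.head = nn.Linear(768 + 768, num_classes)

    def forward(self, input_ids, attention_mask, input_values):
        text_feat = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state[:, 0]
        # Wav2Vec2 returns one 768-d vector per audio frame; average over time.
        audio_frames = self.audio_encoder(input_values).last_hidden_state
        audio_feat = audio_frames.mean(dim=1)
        return self.head(torch.cat([text_feat, audio_feat], dim=-1))
```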

CUET_Absolute_Zero@DravidianLangTech 2025: Detecting AI-Generated Product Reviews in Malayalam and Tamil Language Using Transformer Models
Anindo Barua | Sidratul Muntaha | Momtazul Arefin Labib | Samia Rahman | Udoy Das | Hasan Murad
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Artificial Intelligence (AI) is opening new doors for learning and interaction. However, it has its share of problems. One major issue is the ability of AI to generate text that resembles human-written text. So, how can we tell apart human-written text from AI-generated text? With this in mind, we worked on detecting AI-generated product reviews in Dravidian languages, mainly Malayalam and Tamil. The “Shared Task on Detecting AI-Generated Product Reviews in Dravidian Languages,” held as part of the DravidianLangTech Workshop at NAACL 2025, provided a dataset with two categories: human-written reviews and AI-generated reviews. We implemented four machine learning models (Random Forest, Support Vector Machine, Decision Tree, and XGBoost), four deep learning models (Long Short-Term Memory, Bidirectional Long Short-Term Memory, Gated Recurrent Unit, and Recurrent Neural Network), and three transformer-based models (AI-Human-Detector, Detect-AI-Text, and E5-Small-Lora-AI-Generated-Detector). We conducted a comparative study by training and evaluating each model on the dataset. We found that the transformer E5-Small-Lora-AI-Generated-Detector provided the best result, with an F1 score of 0.8994 on the test set, ranking 7th in the Malayalam language. Because Tamil has higher token overlap and richer morphology than Malayalam, we obtained a lower F1 score of 0.5877, ranking 28th among all participants in the Tamil language.
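
A minimal sketch of the binary fine-tuning setup follows, using the Hugging Face Trainer. Since the exact hub IDs of the detector checkpoints named above are not given here, xlm-roberta-base stands in as a placeholder, and the hyperparameters and toy data are purely illustrative.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Placeholder checkpoint; the paper's detectors (e.g. the
# E5-Small-Lora-AI-Generated-Detector) would be loaded the same way.
model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy dataset: 0 = human-written review, 1 = AI-generated review.
train_ds = Dataset.from_dict({
    "text": ["Sample human-written review.", "Sample AI-generated review."],
    "label": [0, 1],
}).map(lambda batch: tokenizer(batch["text"], truncation=True,
                               padding="max_length", max_length=128),
       batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_ds,
)
trainer.train()
```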

2024

CUET_SSTM at the GEM’24 Summarization Task: Integration of extractive and abstractive method for long text summarization in Swahili language
Samia Rahman | Momtazul Arefin Labib | Hasan Murad | Udoy Das
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges

Swahili, spoken by around 200 million people primarily in Tanzania and Kenya, has been the focus of our research for the GEM Shared Task at INLG’24 on Underrepresented Language Summarization. We utilized the XLSUM dataset and manually summarized 1000 texts from a Swahili news classification dataset. To achieve the desired results, we tested abstractive summarizers (mT5_multilingual_XLSum, t5-small, mBART-50) and an extractive summarizer (based on the PageRank algorithm). Our adopted model, however, is an integrated extractive-abstractive system combining the Bert Extractive Summarizer with abstractive summarizers (t5-small, mBART-50). The integrated model overcomes the drawbacks of purely extractive and purely abstractive summarization while retaining the benefits of both: the extractive summarizer shortens paragraphs exceeding 512 tokens, ensuring no important information is lost before the abstractive models are applied, and the abstractive summarizer then uses its pretrained knowledge to generate a context-based summary.
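
The two-stage pipeline can be sketched as follows, using the bert-extractive-summarizer package for the extractive stage and t5-small for the abstractive stage. The 0.5 extraction ratio and counting input length with the T5 tokenizer are assumptions for illustration.

```python
from summarizer import Summarizer            # bert-extractive-summarizer package
from transformers import AutoTokenizer, pipeline

extractive = Summarizer()
tokenizer = AutoTokenizer.from_pretrained("t5-small")
abstractive = pipeline("summarization", model="t5-small")

def summarize(text: str, max_tokens: int = 512) -> str:
    # Stage 1: extractive shortening, applied only when the input exceeds
    # the abstractive model's 512-token limit, so short documents pass
    # through untouched.
    if len(tokenizer.encode(text)) > max_tokens:
        text = extractive(text, ratio=0.5)
    # Stage 2: abstractive rewriting of the (possibly shortened) text.
    return abstractive(text, max_length=128, min_length=30)[0]["summary_text"]
```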