Mehedi Hasan


2025

TeamHateMate at BLP Task1: Divide and Conquer: A Two-Stage Cascaded Framework with K-Fold Ensembling for Multi-Label Bangla Hate Speech Classification
Mahbub Islam Mahim | Mehedi Hasan
Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)

Detecting hate speech on social media is essential for safeguarding online communities, yet it remains challenging for low-resource languages like Bangla due to class imbalance and subjective annotations. We introduce a two-stage cascaded framework with k-fold ensembling to address the three subtasks of the BLP Workshop 2025 Shared Task: 1A (hate type classification), 1B (target identification), and 1C (joint classification of type, target, and severity). Our solution balances precision and recall, achieving micro-F1 scores of 0.7331 on 1A, 0.7356 on 1B, and 0.7392 on 1C, ranking 4th on 1A and 1st on both 1B and 1C. It performs strongly on major classes, although underrepresented labels such as sexism and mild severity remain challenging. Our method makes optimal use of limited data through k-fold ensembling and delivers balanced overall performance across majority and minority classes by mitigating class imbalance via cascaded layers.
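
The abstract above names k-fold ensembling and micro-F1 without giving details, so the following is only a minimal sketch of one common reading: averaging per-class probabilities from k fold models (soft voting) and scoring the result with micro-F1. The function names, the toy probabilities, and the choice of soft voting are assumptions for illustration and are not taken from the paper; the cascaded stages are not modeled here.

```python
def average_fold_probs(fold_probs):
    """Soft voting (assumed scheme): average class probabilities over k folds.

    fold_probs[f][i][c] is the probability that fold model f assigns
    to class c for example i.
    """
    k = len(fold_probs)
    n_examples = len(fold_probs[0])
    n_classes = len(fold_probs[0][0])
    return [
        [sum(fold_probs[f][i][c] for f in range(k)) / k for c in range(n_classes)]
        for i in range(n_examples)
    ]


def argmax_labels(probs):
    """Pick the index of the highest averaged probability per example."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


def micro_f1(y_true, y_pred):
    """Micro-F1: pool TP, FP, FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical outputs of three fold models on four examples, three classes.
folds = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.4, 0.4, 0.2]],
    [[0.5, 0.4, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6], [0.2, 0.6, 0.2]],
    [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7], [0.3, 0.3, 0.4], [0.3, 0.5, 0.2]],
]
gold = [0, 1, 2, 1]  # hypothetical gold labels

ensemble_pred = argmax_labels(average_fold_probs(folds))
score = micro_f1(gold, ensemble_pred)
print(ensemble_pred, score)
```

For single-label multi-class predictions such as these, micro-F1 coincides with accuracy, since every misclassified example counts once as a false positive and once as a false negative.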

2023

BanglaBait: Semi-Supervised Adversarial Approach for Clickbait Detection on Bangla Clickbait Dataset
Md. Motahar Mahtab | Monirul Haque | Mehedi Hasan | Farig Sadeque
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

A title is clickbait when it intentionally lures readers into clicking on particular content by exploiting their curiosity. Although several studies have focused on detecting clickbait titles in English articles, low-resource languages like Bangla have not been given adequate attention. To tackle clickbait titles in Bangla, we have constructed the first Bangla clickbait detection dataset, containing 15,056 labeled news articles and 65,406 unlabeled news articles extracted from clickbait-dense news sites. Each article has been labeled by three expert linguists and includes the article’s title, body, and other metadata. By incorporating labeled and unlabeled data, we fine-tune a pre-trained Bangla transformer model in an adversarial fashion using Semi-Supervised Generative Adversarial Networks (SS-GANs). The proposed model acts as a good baseline for this dataset, outperforming traditional neural network models (LSTM, GRU, CNN) and linguistic feature-based models. We expect that this dataset and the detailed analysis and comparison of these clickbait detection models will provide a fundamental basis for future research into detecting clickbait titles in Bangla articles.