Sharun Khushbu


2023

Team Error Point at BLP-2023 Task 1: A Comprehensive Approach for Violence Inciting Text Detection using Deep Learning and Traditional Machine Learning Algorithm
Rajesh Das | Jannatul Maowa | Moshfiqur Ajmain | Kabid Yeiad | Mirajul Islam | Sharun Khushbu
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)

In the modern digital landscape, social media platforms have the dual role of fostering unprecedented connectivity and harboring a dark underbelly in the form of widespread violence-inciting content. Pioneering research in Bengali social media aims to provide a groundbreaking solution to this issue. This study thoroughly investigates violence-inciting text classification using a diverse range of machine learning and deep learning models, offering insights into content moderation and strategies for enhancing online safety. Situated at the intersection of technology and social responsibility, the aim is to empower platforms and communities to combat online violence. By providing insights into model selection and methodology, this work makes a significant contribution to the ongoing dialogue about the challenges posed by the darker aspects of the digital era. Our system scored 31.913 and ranked 26th among the participants.

Ushoshi2023 at BLP-2023 Task 2: A Comparison of Traditional to Advanced Linguistic Models to Analyze Sentiment in Bangla Texts
Sharun Khushbu | Nasheen Nur | Mohiuddin Ahmed | Nashtarin Nur
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)

This article describes our analytical approach designed for BLP Workshop-2023 Task 2: Sentiment Analysis. During actual task submission, we used DistilBERT. However, we later applied rigorous hyperparameter tuning and pre-processing, improving the result to 68% accuracy and a 68% F1 micro score with vanilla LSTM. Traditional machine learning models were applied for comparison, where 75% accuracy was achieved with a traditional SVM. Our contributions are a) data augmentation using the oversampling method to remove data imbalance and b) attention masking for data encoding with masked language modeling to capture representations of language semantics effectively, which we further demonstrate with explainable AI. Originally, our system scored 0.26 micro-F1 in the competition and ranked 30th among the participants for a basic DistilBERT model, which we later improved to 0.68 and 0.65 with LSTM and XLM-RoBERTa-base models, respectively.

Team Error Point at BLP-2023 Task 2: A Comparative Exploration of Hybrid Deep Learning and Machine Learning Approach for Advanced Sentiment Analysis Techniques.
Rajesh Das | Kabid Yeiad | Moshfiqur Ajmain | Jannatul Maowa | Mirajul Islam | Sharun Khushbu
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)

This paper presents a thorough and extensive investigation into the diverse models and techniques utilized for sentiment analysis. What sets this research apart is the deliberate and purposeful incorporation of data augmentation techniques with the goal of improving the efficacy of sentiment analysis in the Bengali language. We systematically explore various approaches, including preprocessing techniques, advanced models like Long Short-Term Memory (LSTM) and a combined LSTM-CNN (Convolutional Neural Network), and traditional machine learning models such as Logistic Regression, Decision Tree, Random Forest, Multi-Naive Bayes, Support Vector Machine, and Stochastic Gradient Descent. Our study highlights the substantial impact of data augmentation on enhancing model accuracy and understanding Bangla sentiment nuances. Additionally, we emphasize the LSTM model’s ability to capture long-range correlations in Bangla text. Our system scored 0.4129 and ranked 27th among the participants.