Mabrouka Bessghaier
2026
From Posts to Pressure: An Arabic Dataset about Stress and Mental-Health Monitoring
Wajdi Zaghouani | Eman Sedqy Shlkamy | Mabrouka Bessghaier
Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
How do Arabic-speaking communities express and engage with psychological stress on social media? We introduce AraStress, the first large-scale Arabic corpus dedicated to psychological stress research, comprising 175,862 public social media posts from 2020 to 2024 and covering the pandemic and post-pandemic periods. Unlike prior work, which focuses primarily on Twitter and on depression or suicidality, AraStress fills a significant gap in stress-focused Arabic mental-health NLP resources and enables large-scale analysis of stress-related expressions. Our lexicon-based analysis reveals that stress-related posts elicit predominantly affective engagement and exhibit a hybrid lexical framing that integrates religious and therapeutic language. AraStress provides a foundational resource for culturally grounded computational models of stress detection and digital wellbeing in Arabic-speaking communities.
A Multi-Task Learning Framework for Modeling Engagement and Topic-Sensitive Responses in Arabic Women’s Discourse
Mabrouka Bessghaier | Md. Rafiul Biswas | Shimaa Ibrahim | Wajdi Zaghouani
Findings of the Association for Computational Linguistics: EACL 2026
Predicting how audiences react to Arabic social media posts requires reasoning beyond textual sentiment: reactions emerge from collective interpretation moderated by engagement dynamics and topical context. We present a multi-task learning (MTL) framework that jointly learns (i) audience reaction classification (Love, Haha, Angry, Sad, Care, Wow), (ii) engagement magnitude regression (six reactions, comments, shares), and (iii) non-engagement detection. On a corpus of 158k Arabic Facebook posts spanning women’s rights, gender debates, and economic empowerment, our model achieves a test macro-F1 of 72.4 and weighted-F1 of 89.1.
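The shared-encoder, multi-head architecture implied by the three tasks above can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: layer sizes, the tanh activation, and the random encoder output are all assumptions standing in for a real transformer backbone.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Randomly initialised dense layer as a (weights, bias) pair."""
    return rng.normal(0, 0.02, (in_dim, out_dim)), np.zeros(out_dim)

def forward(x, layer):
    W, b = layer
    return x @ W + b

HIDDEN = 64                          # illustrative hidden size
shared = linear(HIDDEN, HIDDEN)      # shared layer on top of the text encoder

# Three task heads mirroring the framework's objectives:
head_reaction = linear(HIDDEN, 6)    # (i) reaction classes: Love, Haha, Angry, Sad, Care, Wow
head_magnitude = linear(HIDDEN, 8)   # (ii) regression: six reactions + comments + shares
head_nonengage = linear(HIDDEN, 1)   # (iii) binary non-engagement detection

def multitask_forward(pooled):
    """One forward pass: shared representation feeds all three heads."""
    h = np.tanh(forward(pooled, shared))
    logits_cls = forward(h, head_reaction)       # classification logits
    magnitudes = forward(h, head_magnitude)      # engagement-magnitude estimates
    p_nonengage = 1 / (1 + np.exp(-forward(h, head_nonengage)))  # sigmoid
    return logits_cls, magnitudes, p_nonengage

batch = rng.normal(size=(4, HIDDEN))             # 4 mock post embeddings
cls, mag, ne = multitask_forward(batch)
print(cls.shape, mag.shape, ne.shape)            # (4, 6) (4, 8) (4, 1)
```

In practice the three losses (cross-entropy, a regression loss, and binary cross-entropy) would be summed with task weights during training; the sketch only shows how one shared representation serves all three outputs.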
2025
MarsadLab at AraGenEval Shared Task: LLM-Based Approaches to Arabic Authorship Style Transfer and Identification
Md. Rafiul Biswas | Mabrouka Bessghaier | Firoj Alam | Wajdi Zaghouani
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
MarsadLab at AraHealthQA: Hybrid Contextual–Lexical Fusion with AraBERT for Question and Answer Categorization
Mabrouka Bessghaier | Shimaa Ibrahim | Md. Rafiul Biswas | Wajdi Zaghouani
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
MarsadLab at BAREC Shared Task 2025: Strict-Track Readability Prediction with Specialized AraBERT Models on BAREC
Shimaa Ibrahim | Md. Rafiul Biswas | Mabrouka Bessghaier | Wajdi Zaghouani
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
MAHED Shared Task: Multimodal Detection of Hope and Hate Emotions in Arabic Content
Wajdi Zaghouani | Md. Rafiul Biswas | Mabrouka Bessghaier | Shimaa Ibrahim | George Mikros | Abul Hasnat | Firoj Alam
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
MarsadLab at NADI Shared Task: Arabic Dialect Identification and Speech Recognition using ECAPA-TDNN and Whisper
Md. Rafiul Biswas | Kais Attia | Shimaa Ibrahim | Mabrouka Bessghaier | Wajdi Zaghouani
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
MarsadLab at TAQEEM 2025: Prompt-Aware Lexicon-Enhanced Transformer for Arabic Automated Essay Scoring
Mabrouka Bessghaier | Md. Rafiul Biswas | Amira Dhouib | Wajdi Zaghouani
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
Evaluation of Pretrained and Instruction-Based Pretrained Models for Emotion Detection in Arabic Social Media Text
Md. Rafiul Biswas | Shimaa Ibrahim | Mabrouka Bessghaier | Wajdi Zaghouani
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
This study evaluates three approaches to emotion detection in Arabic social media text: instruction prompting of large language models (LLMs), instruction fine-tuning of LLMs, and transformer-based pretrained models. We compare pretrained transformer models such as AraBERT, CAMeLBERT, and XLM-RoBERTa against instruction prompting of advanced LLMs such as GPT-4o, Gemini, DeepSeek, and Fanar, and against instruction fine-tuning of LLMs such as Llama 3.1, Mistral, and Phi. On a carefully preprocessed dataset of 10,000 labeled Arabic tweets with overlapping emotion labels, our findings reveal that transformer-based pretrained models outperform both instruction-based approaches. Instruction prompting leverages general linguistic ability with maximum efficiency but falls short in detecting subtle emotional contexts; instruction fine-tuning is more task-specific but still trails pretrained transformer models. Our findings establish the need for optimized instruction-based approaches and underscore the important role of domain-specific transformer architectures in accurate Arabic emotion detection.
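Overlapping emotion labels make this a multi-label task, which is typically handled by encoding each tweet's label set as a multi-hot vector and scoring predictions with F1. A minimal sketch, with an illustrative label set that is not necessarily the paper's:

```python
# Illustrative emotion inventory; the actual dataset's labels may differ.
EMOTIONS = ["anger", "joy", "sadness", "fear", "love", "surprise"]

def multi_hot(labels):
    """Map a set of emotion labels to a fixed-length 0/1 vector."""
    return [1 if e in labels else 0 for e in EMOTIONS]

def f1(gold, pred):
    """F1 over one pair of multi-hot vectors."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

# A tweet annotated with two emotions, and a prediction getting one right:
gold = multi_hot({"anger", "sadness"})
pred = multi_hot({"anger", "fear"})
print(f1(gold, pred))  # 0.5: precision and recall are both 1/2
```

Averaging such per-example or per-label scores (macro, micro, or weighted) is how the model families above are compared.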
Ahasis Shared Task: Hybrid Lexicon-Augmented AraBERT Model for Sentiment Detection in Arabic Dialects
Shimaa Amer Ibrahim | Mabrouka Bessghaier | Wajdi Zaghouani
Proceedings of the Shared Task on Sentiment Analysis for Arabic Dialects
This work was conducted as part of the Ahasis@RANLP–2025 shared task, which focuses on sentiment detection in Arabic dialects within the hotel review domain. The primary objective is to advance sentiment analysis methodologies tailored to dialectal Arabic. Our approach combines data augmentation with a hybrid model that integrates AraBERT and a sentiment lexicon we created. Notably, the hybrid model significantly improved performance, reaching an F1-score of 0.74, compared to 0.56 when using AraBERT alone. These results highlight the effectiveness of lexicon integration and augmentation strategies in enhancing both the accuracy and robustness of sentiment classification in dialectal Arabic.
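A hybrid lexicon-augmented model of this kind typically concatenates lexicon-derived features with the encoder's pooled embedding before classification. The sketch below illustrates that fusion step only; the lexicon entries, feature choices, and embedding size are made-up stand-ins, not the paper's resource.

```python
# Toy dialectal sentiment lexicon (illustrative entries with polarity scores).
LEXICON = {
    "ممتاز": 1.0,   # "excellent"
    "نظيف": 0.5,    # "clean"
    "وسخ": -0.8,    # "dirty"
    "سيء": -1.0,    # "bad"
}

def lexicon_features(tokens):
    """Three handcrafted features: mean polarity, positive hits, negative hits."""
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    if not scores:
        return [0.0, 0, 0]
    mean = sum(scores) / len(scores)
    return [mean, sum(s > 0 for s in scores), sum(s < 0 for s in scores)]

def fuse(contextual_embedding, tokens):
    """Concatenate the encoder embedding with lexicon features; a classifier
    head would then consume this joint representation."""
    return list(contextual_embedding) + lexicon_features(tokens)

emb = [0.1] * 8                       # stand-in for an AraBERT pooled vector
fused = fuse(emb, ["الفندق", "نظيف", "ممتاز"])   # "the hotel [is] clean, excellent"
print(len(fused), fused[-3:])         # 11 features; last three are lexicon-derived
```

The lexicon features give the classifier an explicit polarity signal for dialect-specific words that a pretrained encoder may represent poorly.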
2024
MARASTA: A Multi-dialectal Arabic Cross-domain Stance Corpus
Anis Charfi | Mabrouka Bessghaier | Andria Atalla | Raghda Akasheh | Sara Al-Emadi | Wajdi Zaghouani
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper introduces a cross-domain and multi-dialectal stance corpus for Arabic that spans four regions of the Arab world and covers the main Arabic dialect groups. Our corpus consists of 4,657 sentences, each manually annotated with its stance towards a specific topic. For each region, we collected sentences related to two controversial topics, and each sentence was annotated by at least two annotators as favoring the topic, being against it, or neutral. The corpus is well balanced with respect to both dialect and stance: for each region, approximately half of the sentences are in Modern Standard Arabic (MSA) and the other half in the region’s respective dialect. We conducted several machine-learning experiments for stance detection using the new corpus. Our most successful model is a Multi-Layer Perceptron (MLP) over unigram or TF-IDF features, which yielded an F1-score of 0.66 and an accuracy of 0.66. Compared with the most similar state-of-the-art dataset, models trained on our corpus performed better on specific stance classes, particularly “neutral” and “against”.
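The unigram TF-IDF representation fed to such an MLP can be computed as below. This is a generic sketch of the feature extraction only, with English placeholder sentences and one common IDF weighting; the paper's corpus is Arabic and its exact vectorizer settings are not specified here.

```python
import math
from collections import Counter

# Toy stance corpus: favor / against / neutral placeholders.
docs = [
    "i fully support the proposal",
    "i am against the proposal",
    "no opinion on the proposal",
]

def tfidf(corpus):
    """Unigram TF-IDF matrix: term frequency times inverse document frequency."""
    tokenized = [doc.split() for doc in corpus]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(tokenized)
    df = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log(n / df[t]) + 1 for t in vocab}  # one common IDF variant
    matrix = []
    for doc in tokenized:
        tf = Counter(doc)
        matrix.append([tf[t] / len(doc) * idf[t] for t in vocab])
    return vocab, matrix

vocab, X = tfidf(docs)
print(len(vocab), len(X), len(X[0]))  # vocabulary size, 3 documents, 3 rows of that width
```

Each row of `X` is then a fixed-length feature vector an MLP classifier can consume; terms appearing in every document (like "the" and "proposal" here) receive the lowest IDF weight.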