Mahmoud Reda


2025

pdf bib
ANLPers at AraGenEval Shared Task: Descriptive Author Tokens for Transparent Arabic Authorship Style Transfer
Omer Nacar | Mahmoud Reda | Serry Sibaee | Yasser Alhabashi | Adel Ammar | Wadii Boulila
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

pdf bib
ANLPers at QIAS: CoT for Islamic Inheritance
Serry Sibaee | Mahmoud Reda | Omer Nacar | Yasser Alhabashi | Adel Ammar | Wadii Boulila
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

2024

pdf bib
Arabic Diacritization Using Morphologically Informed Character-Level Model
Muhammad Morsy Elmallah | Mahmoud Reda | Kareem Darwish | Abdelrahman El-Sheikh | Ashraf Hatim Elneima | Murtadha Aljubran | Nouf Alsaeed | Reem Mohammed | Mohamed Al-Badrashiny
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Arabic diacritic recovery i.e. diacritization is necessary for proper vocalization and an enabler for downstream applications such as language learning and text to speech. Diacritics come in two varieties, namely: core-word diacritics and case endings. In this paper we introduce a highly effective morphologically informed character-level model that can recover both types of diacritics simultaneously. The model uses a Recurrent Neural Network (RNN) based architecture that takes in text as a sequence of characters, with markers for morphological segmentation, and outputs a sequence of diacritics. We also introduce a character-based morphological segmentation model that we train for Modern Standard Arabic (MSA) and dialectal Arabic. We demonstrate the efficacy of our diacritization model on Classical Arabic, MSA, and two dialectal (Moroccan and Tunisian) texts. We achieve the lowest reported word-level diacritization error rate for MSA (3.4%), match the best results for Classical Arabic (5.4%), and report competitive results for dialectal Arabic.