Yousseif Ahmed Elshahawy


2025

Octopus: Towards Building the Arabic Speech LLM Suite
Sara Althubaiti | Vasista Sai Lodagala | Tjad Clark | Yousseif Ahmed Elshahawy | Daniel Izham | Abdullah Alrajeh | Aljawahrah Bin Tamran | Ahmed Ali
Proceedings of The Third Arabic Natural Language Processing Conference

We present Octopus, a first family of modular speech-language models designed for Arabic-English ASR, dialect identification, and speech translation. Built on Whisper-V3 and enhanced with large language models such as ALLaM, LLaMA, and DeepSeek, Octopus bridges speech and text through a lightweight projection layer and Q-Former. To broaden its scope beyond speech, Octopus integrates BEATs, a general-purpose audio encoder, which allows it to understand both linguistic and acoustic events. Despite its simplicity, this dual-encoder design supports robust performance across multilingual and code-switched scenarios. We also introduce TinyOctopus, a distilled variant using smaller models (Distil-Whisper + LLaMA3-1B / DeepSeek-1.5B), achieving competitive results with just a fraction of the parameters. Fine-tuning on synthetic code-switched data further boosts its performance. Octopus demonstrates the power of compact, extensible architectures in Arabic-centric speech modeling and sets the stage for unified multilingual audio-language understanding.

Iqra’Eval: A Shared Task on Qur’anic Pronunciation Assessment
Yassine El Kheir | Amit Meghanani | Hawau Olamide Toyin | Nada Almarwani | Omnia Ibrahim | Yousseif Ahmed Elshahawy | Mostafa Shahin | Ahmed Ali
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

We present the findings of the first shared task on Qur’anic pronunciation assessment, which focuses on addressing the unique challenges of evaluating the precise pronunciation of Qur’anic recitation. To fill an existing research gap, the Iqra’Eval 2025 shared task introduces the first open benchmark for Mispronunciation Detection and Diagnosis (MDD) in Qur’anic recitation, using Modern Standard Arabic (MSA) reading of Qur’anic texts as its case study. The task provides a comprehensive evaluation framework with increasingly complex subtasks: error localization and detailed error diagnosis. Leveraging the recently developed QuranMB benchmark dataset along with auxiliary training resources, this shared task aims to stimulate research in an area of both linguistic and cultural significance while addressing computational challenges in pronunciation assessment.