Ashwag Alasmari
2026
Alexandria: A Multi-Domain Dialectal Arabic Machine Translation Dataset for Culturally Inclusive and Linguistically Diverse LLMs
Abdellah EL Mekki | Samar M. Magdy | Houdaifa Atou | Ruwa AbuHweidi | Baraah Qawasmeh | Omer Nacar | Thikra Al-hibiri | Razan Saadie | Hamzah A. Alsayadi | Nadia Ghezaiel Hammouda | Alshima Mohammed Alkhazimi | Aya Hamod | Al-Yas Yaqoob Al-Ghafri | Wesam El-Sayed | Asila Ismail al Sharji | Mohamad Ballout | Anas Belfathi | Karim Ghaddar | Serry Sibaee | Alaa Aoun | Aeej Mohammed Aseri | Lina Abureesh | Ahlam Bashiti | Majdal Yousef | Abdulaziz Hafiz | Yehdih Mohamed | Emira Hamedtou | Brakehe Emehah | Rahaf Alhamouri | Youssef Nafea | Aya El Aatar | Walid Al-Dhabyani | Emhemed S. Hamed | Sara Shatnawi | Fakhraddin Alwajih | Khalid Elkhidir | Ashwag Alasmari | Abdurrahman Gerrio | Omar Said Alshahri | AbdelRahim A. Elmadany | Ismail Berrada | Amir Azad Adli Al-kathiri | Fadi Zaraket | Mustafa Jarrar | Yahya Mohamed EL Hadj | Hassan Alhuzali | Muhammad Abdul-Mageed
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Arabic is a highly diglossic language in which most daily communication occurs in regional dialects rather than Modern Standard Arabic (MSA). Despite this, machine translation (MT) systems often generalize poorly to dialectal input, limiting their utility for millions of speakers. We introduce Alexandria, a large-scale, community-driven, human-translated dataset designed to bridge this gap. Alexandria covers 13 Arab countries and 11 high-impact domains, including health, education, and agriculture. Unlike previous resources, Alexandria provides unprecedented granularity by associating contributions with city-of-origin metadata, capturing authentic local varieties beyond coarse regional labels. The dataset consists of parallel English-Dialectal Arabic multi-turn conversational scenarios annotated with speaker-addressee gender configurations, enabling the study of gender-conditioned variation in dialectal use. Comprising 107K total turns, Alexandria serves both as a training resource and as a rigorous benchmark for evaluating MT and Large Language Models (LLMs). Our automatic and human evaluation benchmarks the current capabilities of Arabic-aware LLMs in translating across diverse Arabic dialects and sub-dialects while exposing significant persistent challenges. The Alexandria dataset, the creation prompts, the translation and revision guidelines, and the evaluation code are publicly available in the following repository: https://github.com/UBC-NLP/Alexandria
2025
AraHealthQA 2025: The First Shared Task on Arabic Health Question Answering
Hassan Alhuzali | Walid Al-Eisawi | Muhammad Abdul-Mageed | Chaimae Abouzahir | Mouath Abu-Daoud | Ashwag Alasmari | Renad Al-Monef | Ali Alqahtani | Lama Ayash | Leen Kharouf | Farah E. Shamout | Nizar Habash
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
2024
AraMed: Arabic Medical Question Answering using Pretrained Transformer Language Models
Ashwag Alasmari | Sarah Alhumoud | Waad Alshammari
Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024
Medical Question Answering (MQA) systems have gained significant attention in recent years due to their potential to enhance medical decision-making and improve patient care. However, most of the research in this field has focused on English-language datasets, limiting the generalizability of MQA systems to non-English-speaking regions. This study introduces AraMed, a large-scale Arabic Medical Question Answering dataset addressing the limited resources available for Arabic medical question answering. AraMed comprises 270k question-answer pairs based on health consumer questions submitted to an online medical forum. Experiments using various deep learning models showcase the dataset’s effectiveness, with AraBERT models achieving the highest results; specifically, AraBERTv2 obtained an F1 score of 96.73% in the answer selection task. The comparative analysis of different deep learning models provides insights into their strengths and limitations. These findings highlight the potential of AraMed for advancing Arabic medical question answering research and development.
Co-authors
- Muhammad Abdul-Mageed 2
- Hassan Alhuzali 2
- Chaimae Abouzahir 1
- Mouath Abu-Daoud 1
- Ruwa AbuHweidi 1
- Lina Abureesh 1
- Walid Al-Dhabyani 1
- Walid Al-Eisawi 1
- Al-Yas Yaqoob Al-Ghafri 1
- Renad Al-Monef 1
- Thikra Al-hibiri 1
- Amir Azad Adli Al-kathiri 1
- Rahaf Alhamouri 1
- Sarah Alhumoud 1
- Alshima Mohammed Alkhazimi 1
- Ali Alqahtani 1
- Hamzah A. Alsayadi 1
- Omar Said Alshahri 1
- Waad Thuwaini Alshammari 1
- Fakhraddin Alwajih 1
- Alaa Aoun 1
- Aeej Mohammed Aseri 1
- Houdaifa Atou 1
- Lama Ayash 1
- Mohamad Ballout 1
- Ahlam Bashiti 1
- Anas Belfathi 1
- Ismail Berrada 1
- Yahya Mohamed EL Hadj 1
- Abdellah El Mekki 1
- Aya El aatar 1
- Wesam El-Sayed 1
- Khalid Elkhidir 1
- AbdelRahim A. Elmadany 1
- Brakehe Emehah 1
- Abdurrahman Gerrio 1
- Karim Ghaddar 1
- Nizar Habash 1
- Abdulaziz Hafiz 1
- Emhemed S. Hamed 1
- Emira Hamedtou 1
- Nadia Ghezaiel Hammouda 1
- Aya Hamod 1
- Mustafa Jarrar 1
- Leen Kharouf 1
- Samar Mohamed Magdy 1
- Yehdih Mohamed 1
- Omer Nacar 1
- Youssef Nafea 1
- Baraah Qawasmeh 1
- Razan Saadie 1
- Farah E. Shamout 1
- Sara Shatnawi 1
- Serry Sibaee 1
- Majdal Yousef 1
- Fadi A. Zaraket 1
- Asila Ismail al Sharji 1