Mohamed Anwar
2026
Cultural Benchmarking of LLMs in Standard and Dialectal Arabic Dialogues
Muhammad Dehan Al Kautsar | Saeed Almheiri | Momina Ahsan | Bilal Elbouardi | Younes Samih | Sarfraz Ahmad | Amr Keleg | Omar El Herraoui | Kareem Elzeky | Abed Alhakim Freihat | Mohamed Anwar | Zhuohan Xie | Junhong Liang | Mohammad Rustom Al Nasar | Preslav Nakov | Fajri Koto
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
There is a significant gap in evaluating cultural reasoning in LLMs using conversational datasets that capture culturally rich and dialectal contexts. Most Arabic benchmarks focus on short text snippets in Modern Standard Arabic (MSA), overlooking the cultural nuances that naturally arise in dialogues. To address this gap, we introduce ArabCulture-Dialogue, a culturally grounded conversational dataset covering 13 Arabic-speaking countries, in both MSA and each country’s respective dialect, spanning 12 daily-life topics and 54 fine-grained subtopics. We utilize the dataset to form three benchmarking tasks: (i) multiple-choice cultural reasoning, (ii) machine translation between MSA and dialects, and (iii) dialect-steering generation. Our experiments indicate that the performance gap between MSA and Arabic dialects persists: the models perform worse on all three tasks in the dialectal setup than in the MSA one.
2025
Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts
Guokan Shang | Hadi Abdine | Ahmad Chamma | Amr Mohamed | Mohamed Anwar | Abdelaziz Bounhar | Omar El Herraoui | Preslav Nakov | Michalis Vazirgiannis | Eric P. Xing
Proceedings of The Third Arabic Natural Language Processing Conference
We introduce Nile-Chat-4B, 3x4B-A6B, and 12B, a collection of LLMs for the Egyptian dialect, uniquely designed to understand and generate texts written in both Arabic and Latin scripts. Specifically, with Nile-Chat-3x4B-A6B, we introduce a novel language adaptation approach by leveraging the Branch-Train-MiX strategy to merge script-specialized experts into a single MoE model. Our Nile-Chat models significantly outperform leading multilingual and Arabic LLMs, such as LLaMa, Jais, and ALLaM, on our newly introduced Egyptian evaluation benchmarks, which span both understanding and generative tasks. Notably, our 12B model delivers a 14.4% performance gain over Qwen2.5-14B-Instruct on Latin-script benchmarks. All our resources are publicly available. We believe this work presents a comprehensive methodology for adapting LLMs to a single language with dual-script usage, addressing an often overlooked aspect in contemporary LLM development.
BALSAM: A Platform for Benchmarking Arabic Large Language Models
Rawan Al-Matham | Kareem Darwish | Raghad Al-Rasheed | Waad Alshammari | Muneera Alhoshan | Amal Almazrua | Asma Al Wazrah | Mais Alheraki | Firoj Alam | Preslav Nakov | Norah Alzahrani | Eman AlBilali | Nizar Habash | Abdelrahman El-Sheikh | Muhammad Elmallah | Haonan Li | Hamdy Mubarak | Mohamed Anwar | Zaid Alyafeai | Ahmed Abdelali | Nora Altwairesh | Maram Hasanain | Abdulmohsen Al Thubaity | Shady Shehata | Bashar Alhafni | Injy Hamed | Go Inoue | Khalid Elmadani | Ossama Obeid | Fatima Haouari | Tamer Elsayed | Emad Alghamdi | Khalid Almubarak | Saied Alshahrani | Ola Aljarrah | Safa Alajlan | Areej Alshaqarawi | Maryam Alshihri | Sultana Alghurabi | Atikah Alzeghayer | Afrah Altamimi | Abdullah Alfaifi | Abdulrahman AlOsaimy
Proceedings of The Third Arabic Natural Language Processing Conference
The impressive advancement of Large Language Models (LLMs) in English has not been matched across all languages. In particular, LLM performance in Arabic lags behind, due to data scarcity, linguistic diversity of Arabic and its dialects, morphological complexity, etc. Progress is further hindered by the quality of Arabic benchmarks, which typically rely on static, publicly available data, lack comprehensive task coverage, or do not provide dedicated platforms with blind test sets. This makes it challenging to measure actual progress and to mitigate data contamination. Here, we aim to bridge these gaps. In particular, we introduce BALSAM, a comprehensive, community-driven benchmark aimed at advancing Arabic LLM development and evaluation. It includes 78 NLP tasks from 14 broad categories, with 52K examples divided into 37K test and 15K development, and a centralized, transparent platform for blind evaluation. We envision BALSAM as a unifying platform that sets standards and promotes collaborative research to advance Arabic LLM capabilities.
2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
HyoJung Han | Mohamed Anwar | Juan Pino | Wei-Ning Hsu | Marine Carpuat | Bowen Shi | Changhan Wang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Speech recognition and translation systems perform poorly on noisy inputs, which are frequent in realistic environments. Augmenting these systems with visual signals has the potential to improve robustness to noise. However, audio-visual (AV) data is only available in limited amounts and for fewer languages than audio-only resources. To address this gap, we present XLAVS-R, a cross-lingual audio-visual speech representation model for noise-robust speech recognition and translation in over 100 languages. It is designed to maximize the benefits of limited multilingual AV pre-training data, by building on top of audio-only multilingual pre-training and simplifying existing pre-training schemes. Extensive evaluation on the MuAViC benchmark shows the strength of XLAVS-R on downstream audio-visual speech recognition and translation tasks, where it outperforms the previous state of the art by up to 18.5% WER and 4.7 BLEU given noisy AV inputs, and enables strong zero-shot audio-visual ability with audio-only fine-tuning.
Co-authors
- Preslav Nakov 3
- Omar El Herraoui 2
- Ahmed Abdelali 1
- Hadi Abdine 1
- Sarfraz Ahmad 1
- Momina Ahsan 1
- Muhammad Dehan Al Kautsar 1
- Mohammad Rustom Al Nasar 1
- Asma Al Wazrah 1
- Rawan Al-Matham 1
- Raghad Al-Rasheed 1
- Abdulmohsen Al-Thubaity 1
- Abdulrahman AlOsaimy 1
- Safa Alajlan 1
- Firoj Alam 1
- Eman Albilali 1
- Abdullah Alfaifi 1
- Emad Alghamdi 1
- Sultana Alghurabi 1
- Bashar Alhafni 1
- Mais Alheraki 1
- Muneera Alhoshan 1
- Ola Aljarrah 1
- Amal Almazrua 1
- Saeed Almheiri 1
- Khalid Almubarak 1
- Saied Alshahrani 1
- Waad Thuwaini Alshammari 1
- Areej Alshaqarawi 1
- Maryam Alshihri 1
- Afrah Altamimi 1
- Nora Altwairesh 1
- Zaid Alyafeai 1
- Norah A. Alzahrani 1
- Atikah Alzeghayer 1
- Abdelaziz Bounhar 1
- Marine Carpuat 1
- Ahmad Chamma 1
- Kareem Darwish 1
- Abdelrahman El-Sheikh 1
- Bilal Elbouardi 1
- Khalid Elmadani 1
- Muhammad Elmallah 1
- Tamer Elsayed 1
- Kareem Elzeky 1
- Abed Alhakim Freihat 1
- Nizar Habash 1
- Injy Hamed 1
- HyoJung Han 1
- Fatima Haouari 1
- Maram Hasanain 1
- Wei-Ning Hsu 1
- Go Inoue 1
- Amr Keleg 1
- Fajri Koto 1
- Haonan Li 1
- Junhong Liang 1
- Amr Mohamed 1
- Hamdy Mubarak 1
- Ossama Obeid 1
- Juan Pino 1
- Younes Samih 1
- Guokan Shang 1
- Shady Shehata 1
- Bowen Shi 1
- Michalis Vazirgiannis 1
- Changhan Wang 1
- Zhuohan Xie 1
- Eric Xing 1