Bridging Dialectal Gaps in Arabic Medical LLMs through Model Merging

Ahmed Ibrahim, Abdullah Hosseini, Hoda Helmy, Wafa Lakhdhar, Ahmed Serag


Abstract
The linguistic fragmentation of Arabic, with over 30 dialects exhibiting low mutual intelligibility, presents a critical challenge for deploying natural language processing (NLP) in healthcare. Conventional fine-tuning of large language models (LLMs) for each dialect is computationally prohibitive and operationally unsustainable. In this study, we explore model merging as a scalable alternative by integrating three pre-trained LLMs (a medical domain expert, an Egyptian Arabic model, and a Moroccan Darija model) into a unified system without additional fine-tuning. We introduce a novel evaluation framework that assesses dialectal fidelity through two complementary evaluations: LLM-based automated scoring and human assessment by native speakers. Our results demonstrate that the merged model effectively handles cross-dialect medical scenarios, such as interpreting Moroccan Darija inputs for Egyptian Arabic-speaking clinicians, while maintaining high clinical relevance. The merging process reduced computational cost by over 60% compared to per-dialect fine-tuning, highlighting its viability for resource-constrained settings. This work offers a promising path for building dialect-aware medical LLMs at scale, with implications for broader deployment across linguistically diverse regions.
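The abstract does not say which merging algorithm was used. As an illustration only, the sketch below shows one common form of model merging, weighted averaging of parameters across models that share an architecture. The function name `merge_state_dicts`, the toy parameter names, and the uniform weights are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# A toy "state dict": parameter name -> flat list of floats.
# Real checkpoints hold tensors; plain lists keep the sketch dependency-free.
StateDict = Dict[str, List[float]]


def merge_state_dicts(models: Sequence[StateDict],
                      weights: Sequence[float]) -> StateDict:
    """Merge models by a weighted average of their parameters.

    Hypothetical helper for illustration; assumes every model has the
    same parameter names and shapes (i.e. the same architecture).
    """
    if not models or len(models) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalise so weights sum to 1

    keys = set(models[0])
    if any(set(m) != keys for m in models[1:]):
        raise ValueError("models must share the same parameter names")

    merged: StateDict = {}
    for name in models[0]:
        length = len(models[0][name])
        if any(len(m[name]) != length for m in models):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted sum across the models.
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(norm, models))
            for i in range(length)
        ]
    return merged


# Toy stand-ins for the three source models named in the abstract.
medical = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
egyptian = {"layer.weight": [3.0, 2.0], "layer.bias": [3.0]}
darija = {"layer.weight": [5.0, 2.0], "layer.bias": [6.0]}

# Uniform weights are an assumption, not the paper's configuration.
merged = merge_state_dicts([medical, egyptian, darija], [1.0, 1.0, 1.0])
```

Because merging of this kind only combines existing weights, it requires no gradient updates, which is consistent with the abstract's description of integration "without additional fine-tuning".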
Anthology ID:
2025.arabicnlp-main.27
Volume:
Proceedings of The Third Arabic Natural Language Processing Conference
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
Venue:
ArabicNLP
Publisher:
Association for Computational Linguistics
Pages:
338–346
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.arabicnlp-main.27/
Cite (ACL):
Ahmed Ibrahim, Abdullah Hosseini, Hoda Helmy, Wafa Lakhdhar, and Ahmed Serag. 2025. Bridging Dialectal Gaps in Arabic Medical LLMs through Model Merging. In Proceedings of The Third Arabic Natural Language Processing Conference, pages 338–346, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Bridging Dialectal Gaps in Arabic Medical LLMs through Model Merging (Ibrahim et al., ArabicNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.arabicnlp-main.27.pdf