NADI 2025: The First Multidialectal Arabic Speech Processing Shared Task
Bashar Talafha, Hawau Olamide Toyin, Peter Sullivan, AbdelRahim A. Elmadany, Abdurrahman Juma, Amirbek Djanibekov, Chiyu Zhang, Hamad Alshehhi, Hanan Aldarmaki, Mustafa Jarrar, Nizar Habash, Muhammad Abdul-Mageed
Abstract
We present the findings of the sixth Nuanced Arabic Dialect Identification (NADI 2025) Shared Task, which focused on Arabic speech dialect processing across three subtasks: spoken dialect identification (Subtask 1), speech recognition (Subtask 2), and diacritic restoration for spoken dialects (Subtask 3). A total of 44 teams registered, and during the testing phase, 100 valid submissions were received from eight unique teams. The distribution was as follows: 34 submissions for Subtask 1 from five teams, 47 submissions for Subtask 2 from six teams, and 19 submissions for Subtask 3 from two teams. The best-performing systems achieved 79.8% accuracy on Subtask 1, 35.68/12.20 WER/CER (overall average) on Subtask 2, and 55/13 WER/CER on Subtask 3. These results highlight the ongoing challenges of Arabic dialect speech processing, particularly in dialect identification, recognition, and diacritic restoration. We also summarize the methods adopted by participating teams and briefly outline directions for future editions of NADI.
- Anthology ID:
- 2025.arabicnlp-sharedtasks.99
- Volume:
- Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
- Venue:
- ArabicNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 720–733
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.arabicnlp-sharedtasks.99/
- Cite (ACL):
- Bashar Talafha, Hawau Olamide Toyin, Peter Sullivan, AbdelRahim A. Elmadany, Abdurrahman Juma, Amirbek Djanibekov, Chiyu Zhang, Hamad Alshehhi, Hanan Aldarmaki, Mustafa Jarrar, Nizar Habash, and Muhammad Abdul-Mageed. 2025. NADI 2025: The First Multidialectal Arabic Speech Processing Shared Task. In Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks, pages 720–733, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- NADI 2025: The First Multidialectal Arabic Speech Processing Shared Task (Talafha et al., ArabicNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.arabicnlp-sharedtasks.99.pdf