Arab Voices: Mapping Standard and Dialectal Arabic Speech Technology
Peter Sullivan, AbdelRahim A. Elmadany, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed
Abstract
Dialectal Arabic datasets span a range of domains, dialects, and quality levels. To better understand the landscape of these datasets, we perform a computational analysis of their ‘dialectness’ and of a set of audio quality measures. This analysis of the training splits of dialectal Arabic datasets provides a valuable complement to existing literature surveys of dialectal Arabic. To further address inconsistencies between datasets, we also introduce Arab Voices, a standardized framework for supporting Automatic Speech Recognition in dialectal Arabic. This framework provides access to 31 datasets covering 14 dialects, to better address the limited data availability encountered in dialectal Arabic speech processing. Our benchmark further provides a current evaluation of SOTA tools as well as modern multimodal LLMs on dialectal Arabic ASR.
- Anthology ID:
- 2026.findings-acl.575
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 11843–11878
- Language:
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.575/
- DOI:
- Cite (ACL):
- Peter Sullivan, AbdelRahim A. Elmadany, Alcides Alcoba Inciarte, and Muhammad Abdul-Mageed. 2026. Arab Voices: Mapping Standard and Dialectal Arabic Speech Technology. In Findings of the Association for Computational Linguistics: ACL 2026, pages 11843–11878, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Arab Voices: Mapping Standard and Dialectal Arabic Speech Technology (Sullivan et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.575.pdf