Peter Sullivan


2023

pdf
VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System
Abdul Waheed | Bashar Talafha | Peter Sullivan | AbdelRahim Elmadany | Muhammad Abdul-Mageed
Proceedings of ArabicNLP 2023

Arabic is a complex language with many varieties and dialects spoken by ~ 450 millions all around the world. Due to the linguistic diversity and vari-ations, it is challenging to build a robust and gen-eralized ASR system for Arabic. In this work, we address this gap by developing and demoing a system, dubbed VoxArabica, for dialect identi-fication (DID) as well as automatic speech recog-nition (ASR) of Arabic. We train a wide range of models such as HuBERT (DID), Whisper, and XLS-R (ASR) in a supervised setting for Arabic DID and ASR tasks. Our DID models are trained to identify 17 different dialects in addition to MSA. We finetune our ASR models on MSA, Egyptian, Moroccan, and mixed data. Additionally, for the re-maining dialects in ASR, we provide the option to choose various models such as Whisper and MMS in a zero-shot setting. We integrate these models into a single web interface with diverse features such as audio recording, file upload, model selec-tion, and the option to raise flags for incorrect out-puts. Overall, we believe VoxArabica will be use-ful for a wide range of audiences concerned with Arabic research. Our system is currently running at https://cdce-206-12-100-168.ngrok.io/.

2021

pdf bib
Speech Technology for Everyone: Automatic Speech Recognition for Non-Native English
Toshiko Shibano | Xinyi Zhang | Mia Taige Li | Haejin Cho | Peter Sullivan | Muhammad Abdul-Mageed
Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021)