Kunnafonidilaw ka Cadeau: an ASR dataset of present-day Bambara
Michael Leventhal, Yacouba Diarra, Nouhoum Coulibaly, Panga Azazia Kamaté
Abstract
We present Kunkado, a 160-hour Bambara ASR dataset compiled from Malian radio archives to capture present-day spontaneous speech across a wide range of topics. It includes the code-switching, disfluencies, background noise, and overlapping speakers that practical ASR systems encounter in real-world use. We finetune Parakeet-based models on a 33.47-hour human-reviewed subset and apply pragmatic transcript normalization to reduce variability in number formatting, tags, and code-switching annotations. Evaluated on two real-world test sets, finetuning with Kunkado reduces WER from 44.47% to 37.12% on one and from 36.07% to 32.33% on the other. In human evaluation, the resulting model also outperforms a comparable system with the same architecture trained on 98 hours of cleaner, less realistic speech. We release the data and models to support robust ASR for predominantly oral languages.
- Anthology ID:
- 2026.africanlp-main.18
- Volume:
- Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Everlyn Asiko Chimoto, Constantine Lignos, Shamsuddeen Muhammad, Idris Abdulmumin, Clemencia Siro, David Ifeoluwa Adelani
- Venues:
- AfricaNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 190–196
- URL:
- https://preview.aclanthology.org/manual-author-scripts/2026.africanlp-main.18/
- Cite (ACL):
- Michael Leventhal, Yacouba Diarra, Nouhoum Coulibaly, and Panga Azazia Kamaté. 2026. Kunnafonidilaw ka Cadeau: an ASR dataset of present-day Bambara. In Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026), pages 190–196, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- Kunnafonidilaw ka Cadeau: an ASR dataset of present-day Bambara (Leventhal et al., AfricaNLP 2026)
- PDF:
- https://preview.aclanthology.org/manual-author-scripts/2026.africanlp-main.18.pdf