PUMA: Projected Universal Multilingual ASR for Low-Resource Settings. Application to Diverse African Languages
Ilyes Oukid, Bilal Faye, Hanane Azzag, Mustapha Lebbah, Said Yacine Boulahia
Abstract
Multilingual ASR systems often fail to generalize to low-resource and linguistically diverse languages while remaining costly to scale. We introduce PUMA, a unified multilingual ASR model that improves low-resource performance with reduced model complexity. PUMA employs a Universal Language Projection (ULP) module that integrates a learnable language token with acoustic representations, enabling language-aware processing through shared parameters. Experiments on diverse African languages show consistent word error rate reductions over strong multilingual baselines, highlighting improved robustness and generalization. Our code is available at the following GitHub URL: https://github.com/ilyes-okd/PUMA-
- Anthology ID:
- 2026.findings-acl.17
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 371–382
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.17/
- Cite (ACL):
- Ilyes Oukid, Bilal Faye, Hanane Azzag, Mustapha Lebbah, and Said Yacine Boulahia. 2026. PUMA: Projected Universal Multilingual ASR for Low-Resource Settings. Application to Diverse African Languages. In Findings of the Association for Computational Linguistics: ACL 2026, pages 371–382, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- PUMA: Projected Universal Multilingual ASR for Low-Resource Settings. Application to Diverse African Languages (Oukid et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.17.pdf