Thesis Proposal: Self-Adaptive and Epistemic Uncertainty-Guided ASR of Dense Intra-Sentential Code-Switched Speech for African Low-Resource Languages

Umar Baba Umar, Sulaimon Adebayo Bashir, Abdulmalik Danlami Mohammed, Amina Gogo Tafida


Abstract
Automatic Speech Recognition (ASR) has achieved strong performance for high-resource languages, but dense intra-sentential code-switched speech in African low-resource settings remains underexplored. Existing multilingual and pretrained ASR systems improve general recognition accuracy, yet they remain weak at switch regions, are sensitive to language imbalance during adaptation, and are typically evaluated with metrics that obscure switching-specific errors. This thesis proposes a self-adaptive and epistemic uncertainty-guided framework for African low-resource code-switched ASR, using Hausa–English (Engausa) and Hausa–Yorùbá as case studies. The work investigates three linked questions: (1) how to design a linguistically informed code-switched corpus with explicit switch-region annotation and labeled/unlabeled partitions for adaptive learning, (2) whether epistemic uncertainty is systematically elevated around switch regions and can guide pseudo-label selection in semi-supervised training, and (3) whether switch-aware adaptation with auxiliary language identification and boundary supervision can reduce recognition errors without increasing catastrophic forgetting. The long-term goal is to develop scalable and data-efficient ASR systems that model code-switching as a structured linguistic phenomenon rather than as noise in multilingual African speech.
Anthology ID:
2026.acl-srw.67
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
754–763
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.67/
Cite (ACL):
Umar Baba Umar, Sulaimon Adebayo Bashir, Abdulmalik Danlami Mohammed, and Amina Gogo Tafida. 2026. Thesis Proposal: Self-Adaptive and Epistemic Uncertainty-Guided ASR of Dense Intra-Sentential Code-Switched Speech for African Low-Resource Languages. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 754–763, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Thesis Proposal: Self-Adaptive and Epistemic Uncertainty-Guided ASR of Dense Intra-Sentential Code-Switched Speech for African Low-Resource Languages (Umar et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.67.pdf