Jakir Hasan
2026
BanglaIPA: Towards Robust Text-to-IPA Transcription with Contextual Rewriting in Bengali
Jakir Hasan | Shrestha Datta | Md Saiful Islam | Shubhashis Roy Dipta | Ameya Debnath
Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Jakir Hasan | Shrestha Datta | Md Saiful Islam | Shubhashis Roy Dipta | Ameya Debnath
Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Despite its widespread use, Bengali lacks a robust automated International Phonetic Alphabet (IPA) transcription system that effectively supports both standard language and regional dialectal texts. Existing approaches struggle to handle regional variations, numerical expressions, and generalize poorly to previously unseen words. To address these limitations, we propose BanglaIPA, a novel IPA generation system that integrates a character-based vocabulary with word-level alignment. The proposed system accurately handles Bengali numerals and demonstrates strong performance across regional dialects. BanglaIPA improves inference efficiency by leveraging a precomputed word-to-IPA mapping dictionary for previously observed words. The system is evaluated on the standard Bengali and six regional variations of the DUAL-IPA dataset. Experimental results show that BanglaIPA outperforms baseline IPA transcription models by 58.4-78.7% and achieves an overall mean word error rate of 11.4%, highlighting its robustness in phonetic transcription generation for the Bengali language.
2025
BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects
Jakir Hasan | Shubhashis Roy Dipta
Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)
Jakir Hasan | Shubhashis Roy Dipta
Proceedings of the Second Workshop on Bangla Language Processing (BLP-2025)
Real-time speech assistants are becoming increasingly popular for ensuring improved accessibility to information. Bengali, being a low-resource language with a high regional dialectal diversity, has seen limited progress in developing such systems. Existing systems are not optimized for real-time use and focus only on standard Bengali. In this work, we present BanglaTalk, the first real-time speech assistance system for Bengali regional dialects. BanglaTalk follows the client-server architecture and uses the Real-time Transport Protocol (RTP) to ensure low-latency communication. To address dialectal variation, we introduce a dialect-aware ASR system, BRDialect, developed by fine-tuning the IndicWav2Vec model in ten Bengali regional dialects. It outperforms the baseline ASR models by 12.41-33.98% on the RegSpeech12 dataset. Furthermore, BanglaTalk can operate at a low bandwidth of 24 kbps while maintaining an average end-to-end delay of 4.9 seconds. Low bandwidth usage and minimal end-to-end delay make the system both cost-effective and interactive for real-time use cases, enabling inclusive and accessible speech technology for the diverse community of Bengali speakers.