Abstract
Fine-tuning a pre-trained language model is a technique that can be used to enhance the technologies of low-resourced languages. The unsupervised approach can fine-tune any pre-trained model with minimal, or even no, language-specific resources, which is highly advantageous for languages with limited computational resources. We present a novel approach for fine-tuning a pre-trained Automatic Speech Recognition (ASR) model that is suitable for low-resource languages. Our method involves iterative fine-tuning of a pre-trained ASR model, with mms-1b selected as the pre-trained seed model. We take the Nepali language as a case study for this work. Our approach achieved a CER of 6.77%, outperforming all previously reported CER values for Nepali ASR systems.
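The fine-tuning setup named in the abstract can be approximated with the standard Hugging Face Transformers adapter-tuning recipe for MMS. The sketch below is illustrative only, not the paper's method: it assumes the public `facebook/mms-1b-all` checkpoint, the ISO 639-3 code `npi` for Nepali, and an already-prepared (waveform, transcript) dataset; the paper's exact iterative loop, data selection, and hyperparameters may differ.

```python
# Minimal sketch of adapter fine-tuning an MMS checkpoint for Nepali ASR.
# Assumptions (not from the paper): checkpoint "facebook/mms-1b-all",
# language code "npi", 16 kHz mono audio, single-example steps.
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

processor = AutoProcessor.from_pretrained("facebook/mms-1b-all")
processor.tokenizer.set_target_lang("npi")  # Nepali vocabulary

model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/mms-1b-all",
    target_lang="npi",
    ignore_mismatched_sizes=True,  # CTC head is resized to the npi vocab
)
model.init_adapter_layers()  # re-initialise the language adapter weights
model.freeze_base_model()    # train only the adapter + CTC head

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

def training_step(waveform, transcript):
    """One CTC fine-tuning step on a single (audio, text) pair."""
    inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
    labels = processor(text=transcript, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

In an iterative scheme along the lines the abstract describes, such a step would be repeated over successive rounds, with the data used in each round selected from the model's previous predictions; evaluation would track character error rate (edit distance divided by reference length) between rounds.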
- Anthology ID: 2023.icon-1.9
- Volume: Proceedings of the 20th International Conference on Natural Language Processing (ICON)
- Month: December
- Year: 2023
- Address: Goa University, Goa, India
- Editors: Jyoti D. Pawar, Sobha Lalitha Devi
- Venue: ICON
- SIG: SIGLEX
- Publisher: NLP Association of India (NLPAI)
- Pages: 82–89
- URL: https://aclanthology.org/2023.icon-1.9
- Cite (ACL): Rupak Raj Ghimire, Bal Krishna Bal, and Prakash Poudyal. 2023. Active Learning Approach for Fine-Tuning Pre-Trained ASR Model for a Low-Resourced Language: A Case Study of Nepali. In Proceedings of the 20th International Conference on Natural Language Processing (ICON), pages 82–89, Goa University, Goa, India. NLP Association of India (NLPAI).
- Cite (Informal): Active Learning Approach for Fine-Tuning Pre-Trained ASR Model for a Low-Resourced Language: A Case Study of Nepali (Ghimire et al., ICON 2023)
- PDF: https://preview.aclanthology.org/fix-dup-bibkey/2023.icon-1.9.pdf