A Whisper-Based System with Multi-Faceted Data Augmentation for Low-Resource Language
Pin-Cheng Chen, Yu-Chi Chen, Chia-Chun Liang, Cheng-Yu Lin, Ping-Juei Tsai, Wei-Yun Ma
Abstract
This paper presents a comprehensive approach for the Formosa Speech Recognition Challenge 2025 (FSR-2025), targeting automatic speech recognition (ASR) for the under-resourced Dapu and Zhao’an dialects of Taiwanese Hakka. Our method integrates data augmentation and robustness techniques, including SpecAugment, dialect-aware special tokens, text-to-speech (TTS) augmentation, noise/reverberation mixing, and speed perturbation, to mitigate data scarcity and domain mismatch. Experiments on the official FSR-2025 datasets show consistent improvements in both character error rate (CER) and word error rate (WER). Extensive ablation studies further confirm that each component contributes positively. These results offer a practical path toward robust ASR for under-resourced Hakka dialects and suggest broader applicability to other low-resource languages.
- Anthology ID:
- 2025.rocling-main.59
- Volume:
- Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)
- Month:
- November
- Year:
- 2025
- Address:
- National Taiwan University, Taipei City, Taiwan
- Editors:
- Kai-Wei Chang, Ke-Han Lu, Chih-Kai Yang, Zhi-Rui Tam, Wen-Yu Chang, Chung-Che Wang
- Venue:
- ROCLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 489–498
- URL:
- https://preview.aclanthology.org/dashboard/2025.rocling-main.59/
- Cite (ACL):
- Pin-Cheng Chen, Yu-Chi Chen, Chia-Chun Liang, Cheng-Yu Lin, Ping-Juei Tsai, and Wei-Yun Ma. 2025. A Whisper-Based System with Multi-Faceted Data Augmentation for Low-Resource Language. In Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025), pages 489–498, National Taiwan University, Taipei City, Taiwan. Association for Computational Linguistics.
- Cite (Informal):
- A Whisper-Based System with Multi-Faceted Data Augmentation for Low-Resource Language (Chen et al., ROCLING 2025)
- PDF:
- https://preview.aclanthology.org/dashboard/2025.rocling-main.59.pdf