MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition
Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, Yunpeng Liu
Abstract
Meta-learning has proven to be a powerful paradigm for improving low-resource speech recognition by learning generalizable knowledge across multiple tasks. However, multilingual meta-learning also faces challenges such as task overfitting and learner overfitting, which reduce its ability to generalize to new tasks. To address these issues, we augment the meta-training tasks with “more data” during both the training and evaluation phases. Concretely, we propose an interpolation-based task augmentation method called MetaMixSpeech, which includes both support augmentation and query augmentation. MetaMixSpeech enhances task diversity by linearly combining perturbed features from the support and query sets and applying the same linear interpolation to their corresponding losses. Experimental results on the FLEURS and Common Voice datasets demonstrate that MetaMixSpeech achieves a 6.35% improvement in Word Error Rate (WER) compared to meta-learning approaches, effectively mitigating the overfitting problem and showing superior generalization across diverse datasets and language families.
- Anthology ID:
- 2025.findings-emnlp.202
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2025
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3769–3779
- URL:
- https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.202/
- DOI:
- 10.18653/v1/2025.findings-emnlp.202
- Cite (ACL):
- Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, and Yunpeng Liu. 2025. MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 3769–3779, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition (Chen et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.202.pdf
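The interpolation described in the abstract (linearly combining support and query features and applying the same weights to their losses) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mix_features` and `mix_losses`, the array shapes, and the Beta-distributed mixing weight are all assumptions made for illustration, and the abstract does not say how the weight is chosen or how utterances of different lengths are aligned.

```python
import numpy as np


def mix_features(support, query, lam):
    """Linearly interpolate two equally shaped feature arrays.

    Hypothetical helper: assumes both inputs were already padded or
    cropped to the same (frames, feature_dim) shape.
    """
    support = np.asarray(support, dtype=float)
    query = np.asarray(query, dtype=float)
    if support.shape != query.shape:
        raise ValueError("support and query features must share a shape")
    return lam * support + (1.0 - lam) * query


def mix_losses(loss_support, loss_query, lam):
    """Apply the same interpolation weights to the two losses."""
    return lam * loss_support + (1.0 - lam) * loss_query


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumption: the mixing weight is drawn from a Beta distribution,
    # as is common in mixup-style methods.
    lam = rng.beta(2.0, 2.0)
    support_feats = rng.normal(size=(50, 80))  # e.g. 50 frames, 80 filterbank bins
    query_feats = rng.normal(size=(50, 80))
    mixed = mix_features(support_feats, query_feats, lam)
    # Placeholder loss values; in practice these would come from scoring the
    # model output on the mixed input against each set's own transcript.
    total = mix_losses(1.7, 2.3, lam)
    print(mixed.shape, round(float(total), 3))
```

In this sketch the mixed input would be fed to the recognizer once, and the two losses would be computed against the support and query transcripts respectively before being combined with the same weight.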