Yunpeng Liu
2025
MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition
Yaqi Chen | Hao Zhang | Wenlin Zhang | XuKui Yang | Dan Qu | Yunpeng Liu
Findings of the Association for Computational Linguistics: EMNLP 2025
Meta-learning has proven to be a powerful paradigm for improving the performance of low-resource speech recognition by learning generalizable knowledge across multiple tasks. However, multilingual meta-learning also faces challenges such as task overfitting and learner overfitting, which reduce its ability to generalize to new tasks. To address these issues, we augment the meta-training task with “more data” during both the training and evaluation phases. Concretely, we propose an interpolation-based task augmentation method called MetaMixSpeech, which includes both support augmentation and query augmentation. MetaMixSpeech enhances task diversity by linearly combining perturbed features from the support and query sets and applying the same linear interpolation to their corresponding losses. Experimental results on the FLEURS and Common Voice datasets demonstrate that MetaMixSpeech achieves a 6.35% improvement in Word Error Rate (WER) compared to meta-learning approaches, effectively mitigating the overfitting problem and showing superior generalization across diverse datasets and language families.
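
The interpolation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mix_features`, `mix_losses`, and `sample_lambda` are hypothetical, the Beta-distributed mixing weight follows the common mixup convention rather than anything stated here, and the sketch assumes the two feature sequences have already been padded or cropped to the same shape. The two losses passed to `mix_losses` are assumed to be computed separately, one against each utterance's own transcript.

```python
import random


def sample_lambda(alpha=0.5, rng=random):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha).

    Assumption: the Beta prior is the usual mixup choice, not taken from the abstract.
    """
    return rng.betavariate(alpha, alpha)


def mix_features(support, query, lam):
    """Linearly interpolate two feature sequences (lists of frames of floats).

    Assumption: both sequences have the same number of frames and dimensions.
    """
    if len(support) != len(query):
        raise ValueError("this sketch assumes equal-length feature sequences")
    return [
        [lam * s + (1.0 - lam) * q for s, q in zip(s_frame, q_frame)]
        for s_frame, q_frame in zip(support, query)
    ]


def mix_losses(loss_support, loss_query, lam):
    """Apply the same interpolation weight to the two corresponding losses."""
    return lam * loss_support + (1.0 - lam) * loss_query


# Toy example with a fixed weight so the arithmetic is easy to follow.
support_feats = [[1.0, 2.0], [3.0, 4.0]]
query_feats = [[3.0, 6.0], [5.0, 0.0]]
lam = 0.25

mixed = mix_features(support_feats, query_feats, lam)
mixed_loss = mix_losses(2.0, 4.0, lam)
print(mixed)       # [[2.5, 5.0], [4.5, 1.0]]
print(mixed_loss)  # 3.5
```

In a training loop, `lam` would typically come from `sample_lambda()` once per mixed pair, so that the features and the losses share the same weight.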