MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition

Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, Yunpeng Liu


Abstract
Meta-learning has proven to be a powerful paradigm for improving the performance of low-resource speech recognition by learning generalizable knowledge across multiple tasks. However, multilingual meta-learning also faces challenges such as task overfitting and learner overfitting, which reduce its ability to generalize to new tasks. To address these issues, we augment the meta-training tasks with “more data” during both the training and evaluation phases. Concretely, we propose an interpolation-based task augmentation method called MetaMixSpeech, which includes both support augmentation and query augmentation. MetaMixSpeech enhances task diversity by linearly combining perturbed features from the support and query sets and performing the same linear interpolation on their corresponding losses. Experimental results on the FLEURS and Common Voice datasets demonstrate that MetaMixSpeech achieves a 6.35% improvement in Word Error Rate (WER) compared to meta-learning approaches, effectively mitigating the overfitting problem and showcasing superior generalization across diverse datasets and language families.
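The interpolation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mix_features` and `mix_losses`, the Beta-distributed mixing weight, the feature shapes, and the truncation used to align utterances of different lengths are all assumptions borrowed from common mixup practice. It only shows the core idea that the same weight is applied to the features and to the corresponding losses.

```python
import numpy as np


def mix_features(x_a, x_b, lam):
    """Linearly interpolate two (frames, dims) feature arrays.

    Assumption: utterances are aligned by truncating to the shorter one.
    """
    t = min(x_a.shape[0], x_b.shape[0])
    return lam * x_a[:t] + (1.0 - lam) * x_b[:t]


def mix_losses(loss_a, loss_b, lam):
    """Apply the same interpolation weight to the two losses."""
    return lam * loss_a + (1.0 - lam) * loss_b


rng = np.random.default_rng(0)

# Assumption: mixing weight drawn from Beta(0.5, 0.5), as in standard mixup.
lam = rng.beta(0.5, 0.5)

# Hypothetical acoustic features for one support and one query utterance.
support = rng.normal(size=(120, 80))
query = rng.normal(size=(100, 80))

mixed = mix_features(support, query, lam)

# In training, the model would be run on `mixed`, and the loss against the
# support transcript and the loss against the query transcript would be
# combined with the same weight. The loss values below are placeholders.
combined = mix_losses(2.0, 4.0, 0.25)
```

With a weight close to 1 the mixed input is dominated by the support utterance, and the combined loss is dominated by the support loss in the same proportion.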
Anthology ID:
2025.findings-emnlp.202
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3769–3779
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.202/
DOI:
10.18653/v1/2025.findings-emnlp.202
Cite (ACL):
Yaqi Chen, Hao Zhang, Wenlin Zhang, XuKui Yang, Dan Qu, and Yunpeng Liu. 2025. MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 3769–3779, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition (Chen et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.202.pdf
Checklist:
 2025.findings-emnlp.202.checklist.pdf