MoE-SLU: Towards ASR-Robust Spoken Language Understanding via Mixture-of-Experts
Xuxin Cheng, Zhihong Zhu, Xianwei Zhuang, Zhanpeng Chen, Zhiqi Huang, Yuexian Zou
Abstract
As a crucial task in task-oriented dialogue systems, spoken language understanding (SLU) has garnered increasing attention. However, errors from automatic speech recognition (ASR) often hinder understanding performance. To tackle this problem, we propose MoE-SLU, an ASR-robust SLU framework based on the mixture-of-experts technique. Specifically, we first introduce three strategies to generate additional transcripts from clean transcripts. Then, we employ the mixture-of-experts technique to weigh the representations of the generated transcripts, the ASR transcripts, and the corresponding clean manual transcripts. Additionally, we regularize the weighted average of the predictions and the predictions on ASR transcripts by minimizing the Jensen-Shannon Divergence (JSD) between the two output distributions. Experimental results on three benchmark SLU datasets demonstrate that our MoE-SLU achieves state-of-the-art performance. Further model analysis also verifies the superiority of our method.
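The abstract names two concrete mechanisms: a mixture-of-experts gate that weighs the representations of the different transcript variants, and a JSD regularizer between the weighted-average predictions and the predictions on ASR transcripts. Below is a minimal PyTorch sketch of what such components could look like; the gating design, tensor shapes, and all function names are illustrative assumptions on our part, not the authors' implementation.

```python
# A minimal sketch, assuming a PyTorch setup; shapes, names, and the
# gating design are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def moe_weighted_representation(reps: torch.Tensor, gate: torch.nn.Linear) -> torch.Tensor:
    """Mix per-transcript representations with a learned gate.

    reps: (num_transcripts, batch, hidden) encoder outputs for the
          generated, ASR, and clean manual transcripts.
    gate: a Linear(hidden, 1) scoring each transcript's representation.
    """
    scores = gate(reps).squeeze(-1)               # (num_transcripts, batch)
    weights = F.softmax(scores, dim=0)            # normalize over transcripts
    return (weights.unsqueeze(-1) * reps).sum(0)  # (batch, hidden)


def jsd(p_logits: torch.Tensor, q_logits: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon Divergence between two categorical output distributions."""
    p = F.softmax(p_logits, dim=-1)
    q = F.softmax(q_logits, dim=-1)
    m = 0.5 * (p + q)
    # JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m);
    # F.kl_div expects log-probabilities as its first argument.
    return 0.5 * (F.kl_div(m.log(), p, reduction="batchmean")
                  + F.kl_div(m.log(), q, reduction="batchmean"))


# Toy usage: three transcript variants, batch of 8, hidden size 256.
gate = torch.nn.Linear(256, 1)
reps = torch.randn(3, 8, 256)
fused = moe_weighted_representation(reps, gate)    # (8, 256)
reg = jsd(torch.randn(8, 10), torch.randn(8, 10))  # scalar regularizer
```

In the paper itself these pieces operate inside the full SLU model; the sketch only isolates the two ideas stated in the abstract.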
- Anthology ID: 2024.findings-acl.882
- Volume: Findings of the Association for Computational Linguistics: ACL 2024
- Month: August
- Year: 2024
- Address: Bangkok, Thailand
- Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 14868–14879
- URL: https://preview.aclanthology.org/fix-sig-urls/2024.findings-acl.882/
- DOI: 10.18653/v1/2024.findings-acl.882
- Cite (ACL): Xuxin Cheng, Zhihong Zhu, Xianwei Zhuang, Zhanpeng Chen, Zhiqi Huang, and Yuexian Zou. 2024. MoE-SLU: Towards ASR-Robust Spoken Language Understanding via Mixture-of-Experts. In Findings of the Association for Computational Linguistics: ACL 2024, pages 14868–14879, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal): MoE-SLU: Towards ASR-Robust Spoken Language Understanding via Mixture-of-Experts (Cheng et al., Findings 2024)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2024.findings-acl.882.pdf