Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR
Abhishek Gupta, Amruta Parulekar, Sameep Chattopadhyay, Preethi Jyothi
Abstract
Automatic speech recognition (ASR) for low-resource languages remains challenging due to the scarcity of labeled training data. Parameter-efficient fine-tuning and text-only adaptation are two popular methods that have been used to address such low-resource settings. In this work, we investigate how these techniques can be effectively combined using a multilingual multimodal model like SeamlessM4T. Multimodal models can leverage unlabeled text via text-only adaptation, followed by parameter-efficient ASR fine-tuning, thus boosting ASR performance. We also show cross-lingual transfer from a high-resource language, achieving up to a 17% relative WER reduction over the baseline in an extremely low-resource setting without any labeled speech.
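As a rough, minimal sketch of the parameter-efficient fine-tuning idea described in the abstract, the snippet below attaches LoRA adapters to a SeamlessM4T speech-to-text model using Hugging Face transformers and peft, so that only the low-rank adapter weights are trained. The checkpoint name, target module names, and LoRA hyperparameters are illustrative assumptions, not the authors' actual configuration, and the paper's text-only adaptation stage is not shown.

```python
# Minimal sketch: LoRA-based parameter-efficient ASR fine-tuning of SeamlessM4T.
# Checkpoint, target modules, and hyperparameters are assumptions, not the paper's setup.
import numpy as np
from transformers import AutoProcessor, SeamlessM4TForSpeechToText
from peft import LoraConfig, get_peft_model

model_name = "facebook/hf-seamless-m4t-medium"  # assumed public checkpoint
processor = AutoProcessor.from_pretrained(model_name)
model = SeamlessM4TForSpeechToText.from_pretrained(model_name)

# Freeze the base model and train only low-rank adapters on the attention projections.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed module names in the attention blocks
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of weights are trainable

# One illustrative training step on a dummy (audio, transcript) pair.
audio = np.random.randn(32000).astype(np.float32)  # 2 s of 16 kHz audio
inputs = processor(audios=audio, sampling_rate=16000, return_tensors="pt")
labels = processor(text="hello world", src_lang="eng", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()  # gradients flow only into the LoRA adapter weights
```

Per the abstract, the unlabeled text would first be used for text-only adaptation through the model's text pathway, with the parameter-efficient ASR fine-tuning above applied afterwards.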
- Anthology ID: 2024.mrl-1.13
- Volume: Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Jonne Sälevä, Abraham Owodunni
- Venues: MRL | WS
- Publisher: Association for Computational Linguistics
- Pages: 175–185
- URL: https://preview.aclanthology.org/fix-sig-urls/2024.mrl-1.13/
- DOI: 10.18653/v1/2024.mrl-1.13
- Cite (ACL): Abhishek Gupta, Amruta Parulekar, Sameep Chattopadhyay, and Preethi Jyothi. 2024. Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR. In Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024), pages 175–185, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR (Gupta et al., MRL 2024)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2024.mrl-1.13.pdf