Abstract
Recognizing code-switching (CS) speech is challenging for automatic speech recognition (ASR) systems because short monolingual segments carry limited linguistic context, which leads to language confusion. To mitigate this issue, language identity (LID) is often integrated into the speech recognition system to provide additional linguistic context. However, previous work predominantly focuses on extracting language identity from speech signals. We introduce a novel approach that learns language identity from pure text data via a dedicated language identity-language model. We also explore two strategies, LID state fusion and language posterior biasing, for integrating the text-derived language identities into the end-to-end ASR system. By incorporating hypothesized language identities, our ASR system gains crucial contextual cues, effectively capturing language transitions and patterns within code-switched utterances. We conduct speech recognition experiments on the SEAME corpus and demonstrate the effectiveness of our proposed methods. Our results reveal significantly improved transcriptions in code-switching scenarios, underscoring the potential of text-derived LID in enhancing code-switching speech recognition.
- Anthology ID:
- 2023.calcs-1.4
- Volume:
- Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Genta Winata, Sudipta Kar, Marina Zhukova, Thamar Solorio, Mona Diab, Sunayana Sitaram, Monojit Choudhury, Kalika Bali
- Venues:
- CALCS | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 33–42
- URL:
- https://aclanthology.org/2023.calcs-1.4
- DOI:
- 10.18653/v1/2023.calcs-1.4
- Cite (ACL):
- Qinyi Wang and Haizhou Li. 2023. Text-Derived Language Identity Incorporation for End-to-End Code-Switching Speech Recognition. In Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching, pages 33–42, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Text-Derived Language Identity Incorporation for End-to-End Code-Switching Speech Recognition (Wang & Li, CALCS-WS 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.calcs-1.4.pdf