Hou-An Lin

2021

pdf bib abs
Exploiting Low-Resource Code-Switching Data to Mandarin-English Speech Recognition Systems
Hou-An Lin | Chia-Ping Chen
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

In this paper, we investigate how to use limited code-switching data to implement a code-switching speech recognition system. We utilize the Transformer end-to-end model to develop our code switching speech recognition system, which is trained with the Mandarin dataset and a small amount of Mandarin-English code switching dataset, as the baseline of this paper. Next, we compare the performance of systems after adding multi-task learning and transfer learning. Character Error Rate(CER) is adopted as the criterion for the system. Finally, we combined the three systems with the language model, respectively, our best result dropped to 23.9% compared with the baseline of 28.7%.

Co-authors

Chia-Ping Chen 1

Venues

ROCLING1