Development of Mandarin-English code-switching speech synthesis system

Hsin-Jou Lien, Li-Yu Huang, Chia-Ping Chen


Abstract
This paper proposes a Mandarin-English code-switching speech synthesis system. To focus on learning the content information between the two languages, the system is trained on a multilingual artificial dataset whose speaker style is unified. Adding a language embedding helps the system adapt to the multilingual dataset. In addition, text preprocessing is applied in a language-dependent way. For Mandarin, word segmentation and text-to-pinyin conversion not only improve fluency but also reduce learning complexity. Number normalization decides whether the Arabic numerals in a sentence need to have digits added. For English, acronym conversion decides how an acronym is pronounced.
Anthology ID:
2022.rocling-1.15
Volume:
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)
Month:
November
Year:
2022
Address:
Taipei, Taiwan
Editors:
Yung-Chun Chang, Yi-Chin Huang
Venue:
ROCLING
Publisher:
The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages:
116–120
Language:
Chinese
URL:
https://aclanthology.org/2022.rocling-1.15
Cite (ACL):
Hsin-Jou Lien, Li-Yu Huang, and Chia-Ping Chen. 2022. Development of Mandarin-English code-switching speech synthesis system. In Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022), pages 116–120, Taipei, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Cite (Informal):
Development of Mandarin-English code-switching speech synthesis system (Lien et al., ROCLING 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.rocling-1.15.pdf