Non-parallel Accent Transfer based on Fine-grained Controllable Accent Modelling

Linqin Wang, Zhengtao Yu, Yuanzhang Yang, Shengxiang Gao, Cunli Mao, Yuxin Huang


Abstract
Existing accent transfer works rely on parallel data or speech recognition models. This paper focuses on the practical application of accent transfer and aims to implement accent transfer using non-parallel datasets. The key challenges are disentangling speech representations and modeling accents. Our accent transfer framework addresses these problems with two proposed methods. First, we learn the suprasegmental information associated with tone to finely model accents in terms of tone and rhythm. Second, we propose to use mutual information learning to disentangle accent features and to control the accent of the generated speech at inference time. Experiments show that the proposed framework attains superior performance to the baseline models in terms of accentedness and audio quality.
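The abstract describes using mutual information (MI) learning to disentangle accent features from other speech representations. The paper's actual estimator is not specified here, so the following is only a hedged, minimal sketch of one simple MI-style disentanglement penalty: a closed-form approximation that assumes each embedding dimension is roughly Gaussian, where the MI per dimension is `-0.5 * log(1 - rho^2)` for Pearson correlation `rho`. The function name and setup are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def gaussian_mi_penalty(accent, content, eps=1e-6):
    """Approximate MI between paired accent/content embeddings.

    Hypothetical sketch: assumes each dimension is roughly Gaussian,
    so per-dimension MI is -0.5 * log(1 - rho^2), summed over dims.
    A low value suggests the two representations are disentangled.

    accent, content: arrays of shape (num_frames, embed_dim).
    """
    # Standardize each dimension to zero mean, unit variance.
    a = (accent - accent.mean(axis=0)) / (accent.std(axis=0) + eps)
    c = (content - content.mean(axis=0)) / (content.std(axis=0) + eps)
    # Per-dimension Pearson correlation between the two embeddings.
    rho = (a * c).mean(axis=0)
    rho = np.clip(rho, -1 + eps, 1 - eps)  # keep log() well-defined
    # Gaussian MI per dimension, summed; used as a penalty to minimize.
    return float(-0.5 * np.log(1.0 - rho ** 2).sum())
```

In a training loop, a term like this would be added to the loss so the accent encoder is pushed to carry no information predictable from the content encoder; independent embeddings yield a penalty near zero, while correlated ones yield a large penalty.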
Anthology ID:
2023.findings-emnlp.622
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9288–9298
URL:
https://aclanthology.org/2023.findings-emnlp.622
DOI:
10.18653/v1/2023.findings-emnlp.622
Cite (ACL):
Linqin Wang, Zhengtao Yu, Yuanzhang Yang, Shengxiang Gao, Cunli Mao, and Yuxin Huang. 2023. Non-parallel Accent Transfer based on Fine-grained Controllable Accent Modelling. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 9288–9298, Singapore. Association for Computational Linguistics.
Cite (Informal):
Non-parallel Accent Transfer based on Fine-grained Controllable Accent Modelling (Wang et al., Findings 2023)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2023.findings-emnlp.622.pdf