ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations
Neil Shah, Saiteja Kosgi, Vishal Tambrahalli, Neha S, Anil Nelakanti, Vineet Gandhi
Abstract
We present ParrotTTS, a modularized text-to-speech synthesis model leveraging disentangled self-supervised speech representations. It can train a multi-speaker variant effectively using transcripts from a single speaker. ParrotTTS adapts to a new language in a low-resource setup and generalizes to languages not seen while training the self-supervised backbone. Moreover, without training on bilingual or parallel examples, ParrotTTS can transfer voices across languages while preserving speaker-specific characteristics, e.g., synthesizing fluent Hindi speech using a French speaker’s voice and accent. We present extensive results in monolingual and multi-lingual scenarios. ParrotTTS outperforms state-of-the-art multi-lingual text-to-speech (TTS) models while using only a fraction of the paired data that the latter require. Speech samples from ParrotTTS and code can be found at https://parrot-tts.github.io/tts/
- Anthology ID:
- 2024.findings-eacl.6
- Volume:
- Findings of the Association for Computational Linguistics: EACL 2024
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 79–91
- URL:
- https://aclanthology.org/2024.findings-eacl.6
- Cite (ACL):
- Neil Shah, Saiteja Kosgi, Vishal Tambrahalli, Neha S, Anil Nelakanti, and Vineet Gandhi. 2024. ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations. In Findings of the Association for Computational Linguistics: EACL 2024, pages 79–91, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations (Shah et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.findings-eacl.6.pdf