Yisi Liu
2025
RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
Yisi Liu | Chenyang Wang | Hanjo Kim | Raniya Khan | Gopala Anumanchipalli
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Voice conversion has emerged as a pivotal technology in numerous applications ranging from assistive communication to entertainment. In this paper, we present RT-VC, a zero-shot real-time voice conversion system that delivers ultra-low latency and high-quality performance. Our approach leverages an articulatory feature space to naturally disentangle content and speaker characteristics, facilitating more robust and interpretable voice transformations. Additionally, the integration of differentiable digital signal processing (DDSP) enables efficient vocoding directly from articulatory features, significantly reducing conversion latency. Experimental evaluations demonstrate that, while maintaining synthesis quality comparable to the current state-of-the-art (SOTA) method, RT-VC achieves a CPU latency of 61.4 ms, representing a 13.3% reduction in latency.
2015
The University of Illinois submission to the WMT 2015 Shared Translation Task
Lane Schwartz | Bill Bryce | Chase Geigle | Sean Massung | Yisi Liu | Haoruo Peng | Vignesh Raja | Subhro Roy | Shyam Upadhyay
Proceedings of the Tenth Workshop on Statistical Machine Translation
Co-authors
- Gopala Anumanchipalli 1
- Bill Bryce 1
- Chase Geigle 1
- Raniya Khan 1
- Hanjo Kim 1