Thomas Okken


2012

Building Text-To-Speech Voices in the Cloud
Alistair Conkie | Thomas Okken | Yeon-Jun Kim | Giuseppe Di Fabbrizio
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The AT&T VoiceBuilder provides a new tool to researchers and practitioners who want to have their voices synthesized by a high-quality commercial-grade text-to-speech system without the need to install, configure, or manage speech processing software and equipment. It is implemented as a web service on the AT&T Speech Mashup Portal. The system records and validates users' utterances, processes them to build a synthetic voice, and provides a web service API to make the voice available to real-time applications through a scalable cloud-based processing platform. All the procedures are automated to avoid human intervention. We present experimental comparisons of voices built using the system.