Horst-Udo Hain

Also published as: H.-U. Hain

2008

pdf abs
Evaluation of Modules and Tools for Speech Synthesis: the ECESS Framework
Harald Höge | Zdravko Kacic | Bojan Kotnik | Matej Rojc | Nicolas Moreau | Horst-Udo Hain
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The consortium ECESS (European Center of Excellence for Speech Synthesis) has set up a framework for evaluation of software modules and tools relevant for speech synthesis. Till now two lines of evaluation campaigns have been established: (1) Evaluation of the ECESS TTS modules (text processing, prosody, acoustic synthesis). (2) Evaluation of ECESS tools (pitch extraction, voice activity detection, phonetic segmentation). The functionality and interfaces of the ECESS TTS have been developed by a joint effort between ECESS and the EC-funded project TC-STAR . First evaluation campaigns were conducted within TC-STAR using the ECESS framework. As TC-STAR finished in March 2007, ECESS continued and extended the evaluation of ECESS TTS modules and tools by its own. Within the paper we describe a novel framework which allows performing remote evaluation for modules via the web. First experimental results are reported. Further the result of several evaluation campaigns for tools handling pitch extraction and voice activity detection are presented.

2006

pdf abs
ECESS Inter-Module Interface Specification for Speech Synthesis
Javier Pérez | Antonio Bonafonte | Horst-Udo Hain | Eric Keller | Stefan Breuer | Jilei Tian
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The newly founded European Centre of Excellence for Speech Synthesis (ECESS) is an initiative to promote the development of the European research area (ERA) in the field of Language Technology. ECESS focuses on the great challenge of high-quality speech synthesis which is of crucial importance for future spoken-language technologies. The main goals of ECESS are to achieve the critical mass needed to promote progress in TTS technology substantially, to integrate basic research know-how related to speech synthesis and to attract public and private funding. To this end, a common system architecture based on exchangeable modules supplied by the ECESS members is to be established. The XML-based interface that connects these modules is the topic of this paper.

In the framework of the EU funded project TC-STAR (Technology and Corpora for Speech to Speech Translation),research on TTS aims on providing a synthesized voice sounding like the source speaker speaking the target language. To progress in this direction, research is focused on naturalness, intelligibility, expressivity and voice conversion both, in the TC-STAR framework. For this purpose, specifications on large, high quality TTS databases have been developed and the data have been recorded for UK English, Spanish and Mandarin. The development of speech technology in TC-STAR is evaluation driven. Assessment of speech synthesis is needed to determine how well a system or technique performs in comparison to previous versions as well as other approaches (systems & methods). Apart from testing the whole system, all components of the system will be evaluated separately. This approach grants better assesment of each component as well as identification of the best techniques in the different speech synthesisprocesses.This paper describes the specifications of Language Resources for speech synthesis and the specifications for evaluation of speech synthesis activities.

Co-authors

Henk van den Heuvel 1

Venues

lrec3