Jon Sánchez

Also published as: J. Sánchez, Jon Sanchez


2012

Versatile Speech Databases for High Quality Synthesis for Basque
Iñaki Sainz | Daniel Erro | Eva Navas | Inma Hernáez | Jon Sanchez | Ibon Saratxaga | Igor Odriozola
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents three new speech databases for standard Basque. They are designed primarily for corpus-based synthesis, but each database has its own specific purpose: 1) AhoSyn: high quality speech synthesis (also recorded in Spanish), 2) AhoSpeakers: voice conversion and 3) AhoEmo3: emotional speech synthesis. The whole corpus design and the recording process are described in detail. Once the databases were collected, all the data was automatically labelled and annotated. Then, an HMM-based TTS voice was built and subjectively evaluated. The results of the evaluation are satisfactory: 3.70 MOS for Basque and 3.44 for Spanish. Therefore, the evaluation supports the quality of this new speech resource and the validity of the automated processing presented.

Using an ASR database to design a pronunciation evaluation system in Basque
Igor Odriozola | Eva Navas | Inma Hernaez | Iñaki Sainz | Ibon Saratxaga | Jon Sánchez | Daniel Erro
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a method to build CAPT systems for under-resourced languages, such as Basque, using a general-purpose ASR speech database. More precisely, the proposed method consists of automatically determining the threshold of the GOP (Goodness Of Pronunciation) scores, which are used as pronunciation scores at the phone level. Two score distributions have been obtained for each phoneme, corresponding to its correct and incorrect pronunciations. The distribution of the scores for erroneous pronunciations has been calculated by inserting controlled errors in the dictionary, so that each changed phoneme is randomly replaced by a phoneme from the same group. These groups have been obtained by means of a phonetic clustering performed using regression trees. After obtaining both distributions, the EER (Equal Error Rate) of each distribution pair has been calculated and used as a decision threshold for each phoneme. The results show that this method is useful when there is no database specifically designed for CAPT systems, although it is not as accurate as systems built on databases specifically designed for this purpose.
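The thresholding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `eer_threshold`, the brute-force sweep over observed scores, and the convention that a higher GOP score means a better pronunciation are all assumptions made for illustration. Given the scores of one phoneme for correct and for artificially corrupted pronunciations, it looks for the threshold at which the false rejection rate and the false acceptance rate are closest.

```python
from typing import Sequence, Tuple


def eer_threshold(correct: Sequence[float],
                  incorrect: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, eer) for one phoneme (hypothetical helper).

    Assumption: higher scores mean better pronunciation, so a phone
    is accepted when its score is >= threshold.
    """
    if not correct or not incorrect:
        raise ValueError("both score lists must be non-empty")

    best_gap, best_thr, best_eer = float("inf"), 0.0, 1.0
    # Every observed score is a candidate threshold.
    for thr in sorted(set(correct) | set(incorrect)):
        # Correct pronunciations rejected at this threshold.
        frr = sum(s < thr for s in correct) / len(correct)
        # Incorrect pronunciations accepted at this threshold.
        far = sum(s >= thr for s in incorrect) / len(incorrect)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the mean of both rates as the EER estimate.
            best_gap, best_thr, best_eer = gap, thr, (far + frr) / 2
    return best_thr, best_eer


if __name__ == "__main__":
    # Toy scores, invented for the example only.
    correct_scores = [-1.0, -0.5, -0.2, -0.1]
    incorrect_scores = [-3.0, -2.0, -1.5, -0.4]
    print(eer_threshold(correct_scores, incorrect_scores))
```

In a per-phoneme setting, this function would be called once for each phoneme with its own pair of score lists, yielding one decision threshold per phoneme.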

2010

AhoTransf: A Tool for Multiband Excitation Based Speech Analysis and Modification
Ibon Saratxaga | Inmaculada Hernáez | Eva Navas | Iñaki Sainz | Iker Luengo | Jon Sánchez | Igor Odriozola | Daniel Erro
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we present AhoTransf, a tool that enables analysis, visualization, modification and synthesis of speech. AhoTransf integrates a speech signal analysis model with a graphical user interface to allow visualization and modification of the parameters of the model. The synthesis capability makes it possible to hear the modified signal, thus providing a quick way to understand the perceptual effect of changes in the parameters of the model. The speech analysis/synthesis algorithm is based on the Multiband Excitation technique, but uses a novel representation of phase information, the Relative Phase Shifts (RPS). With this representation, not only the amplitudes but also the phases of the harmonic components of the speech signal reveal their structured patterns in the visualization tool. AhoTransf is designed to be modular so that it can be used with different harmonic speech models.

2008

Text Independent Speaker Identification in Multilingual Environments
Iker Luengo | Eva Navas | Iñaki Sainz | Ibon Saratxaga | Jon Sanchez | Igor Odriozola | Inma Hernaez
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Speaker identification and verification systems perform poorly when model training is done in one language while testing is done in another. This situation is not unusual in multilingual environments, where users should be able to access the system in whichever language they prefer at any moment, without noticing a performance drop. In this work we study the possibility of using features derived from prosodic parameters in order to reinforce the language robustness of these systems. First, the properties of the features in terms of language and session variability are studied, predicting an increase in language robustness when frame-wise intonation and energy values are combined with traditional MFCC features. The experimental results confirm that these features improve speaker recognition rates under language-mismatch conditions. The whole study is carried out in the Basque Country, a bilingual region in which the Basque and Spanish languages coexist.

Subjective Evaluation of an Emotional Speech Database for Basque
Iñaki Sainz | Ibon Saratxaga | Eva Navas | Inmaculada Hernáez | Jon Sanchez | Iker Luengo | Igor Odriozola
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the evaluation process of an emotional speech database recorded for standard Basque, in order to determine its adequacy for the analysis of emotional models and its use in speech synthesis. The corpus consists of seven hundred semantically neutral sentences that were recorded for the Big Six emotions and a neutral style by two professional actors. The test results show that every emotion is recognized far above chance level for both speakers. Therefore, the database is a valid linguistic resource for the research and development purposes it was designed for.

2004

Designing and Recording an Audiovisual Database of Emotional Speech in Basque
Eva Navas | Amaia Castelruiz | Iker Luengo | Jon Sánchez | Inmaculada Hernáez
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC'04)

2002

BIZKAIFON: A sound archive of dialectal varieties of spoken Basque
I. Hernáez | E. Navas | J. Sánchez | I. Madariaga | I. Gaminde | X. Zalbide
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC'02)