José Gregorio Escalada


2012

pdf
The I3MEDIA speech database: a trilingual annotated corpus for the analysis and synthesis of emotional speech
Juan María Garrido | Yesika Laplaza | Montse Marquina | Andrea Pearman | José Gregorio Escalada | Miguel Ángel Rodríguez | Ana Armenta
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this article the I3Media corpus is presented, a trilingual (Catalan, English, Spanish) speech database of neutral and emotional material collected for analysis and synthesis purposes. The corpus is actually made up of six different subsets of material: a neutral subcorpus, containing emotionless utterances; a ‘dialog' subcorpus, containing typical call center utterances; an ‘emotional' corpus, a set of sentences representative of pure emotional states; a ‘football' subcorpus, including utterances imitating a football broadcasting situation; a ‘SMS' subcorpus, including readings of SMS texts; and a ‘paralinguistic elements' corpus, including recordings of interjections and paralinguistic sounds uttered in isolation. The corpus was read by professional speakers (male, in the case of Spanish and Catalan; female, in the case of the English corpus), carefully selected to meet criteria of language competence, voice quality and acting conditions. It is the result of a collaboration between the Speech Technology Group at Telefónica Investigación y Desarrollo (TID) and the Speech and Language Group at Barcelona Media Centre d'Innovació (BM), as part of the I3Media project.