HESITA(te) in Portuguese

Sara Candeias, Dirce Celorico, Jorge Proença, Arlindo Veiga, Carla Lopes, Fernando Perdigão


Abstract
Hesitations, also called disfluencies, are characteristic of spontaneous speech: they play a primary role in its structure and reflect aspects of language production and the management of communication between speakers. In this paper we present HESITA, a database of hesitations in European Portuguese speech, as a basis for studying a variety of speech phenomena. Patterns of hesitations, the distribution of hesitations according to speaking style, and the phonetic properties of fillers are some of the characteristics we extracted from the HESITA database. The database is also an important resource for improving the naturalness of synthetic speech and for robust acoustic modelling in automatic speech recognition. HESITA is the output of a speech-processing project for European Portuguese, carried out by an interdisciplinary group that closely combines engineering tools and experience with a linguistic approach.
Anthology ID:
L14-1473
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1564–1567
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/587_Paper.pdf
Cite (ACL):
Sara Candeias, Dirce Celorico, Jorge Proença, Arlindo Veiga, Carla Lopes, and Fernando Perdigão. 2014. HESITA(te) in Portuguese. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1564–1567, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
HESITA(te) in Portuguese (Candeias et al., LREC 2014)