HESITA(te) in Portuguese
Sara Candeias, Dirce Celorico, Jorge Proença, Arlindo Veiga, Carla Lopes, Fernando Perdigão
Abstract
Hesitations, also known as disfluencies, are characteristic of spontaneous speech: they play a primary role in its structure and reflect aspects of language production and the management of communication between speakers. In this paper we present HESITA, a database of hesitations in European Portuguese speech, as a basis for studying a variety of speech phenomena. Patterns of hesitations, the distribution of hesitations according to speaking style, and the phonetic properties of fillers are some of the characteristics we extracted from the HESITA database. The database is also an important resource for improving the naturalness of synthetic speech and for robust acoustic modelling in automatic speech recognition. HESITA is the output of a European Portuguese speech-processing project carried out by an interdisciplinary group that closely combines engineering tools and experience with a linguistic approach.
- Anthology ID:
- L14-1473
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1564–1567
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/587_Paper.pdf
- Cite (ACL):
- Sara Candeias, Dirce Celorico, Jorge Proença, Arlindo Veiga, Carla Lopes, and Fernando Perdigão. 2014. HESITA(te) in Portuguese. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1564–1567, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- HESITA(te) in Portuguese (Candeias et al., LREC 2014)