RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus

Tiberiu Boroș, Adriana Stan, Oliver Watts, Stefan Daniel Dumitrescu


Abstract
This paper introduces a recent extension of a Romanian speech corpus that adds prosodic annotations of the speech data in the form of ToBI labels. We describe the methodology for determining the required pitch patterns that are common in Romanian, annotate the speech resource, and then compare two text-to-speech synthesis systems to establish the benefits of adding this type of information to our speech resource. The result is a publicly available speech dataset that can be used to further develop speech synthesis systems or to automatically learn to predict ToBI labels from Romanian text.
Anthology ID:
L14-1569
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
316–320
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/727_Paper.pdf
Cite (ACL):
Tiberiu Boroș, Adriana Stan, Oliver Watts, and Stefan Daniel Dumitrescu. 2014. RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 316–320, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus (Boroș et al., LREC 2014)