Abstract
The main aim of this study is to describe the process of creating a speech database to be used in corpus based text-to-speech synthesis. To help achieve natural sounding speech synthesis, the database construction was aimed at rich phonetic and prosodic coverage based on variable length units (phoneme, diphone, triphone) from different phonetic and prosodic contexts. Following previous work on determining the optimal coverage (Szklanny and Oliver, 2005), text selection was based on the existing text corpus containing parliamentary statements. Corpus balancing was followed by recording of the material. Automatic segmentation was performed, followed by both an automatic and manual check of the data to determine speaker specific phenomena and correct the labelling. Additionally, prosodic annotation involving assignment of the intonation contours was performed in order to assess the accent realisation and determine the prosodic coverage of the database. The prototype speech synthesiser was built to determine the validity of the above steps and test the resulting voice quality.- Anthology ID:
- L06-1428
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/688_pdf.pdf
- DOI:
- Cite (ACL):
- Dominika Oliver and Krzysztof Szklanny. 2006. Creation and analysis of a Polish speech database for use in unit selection synthesis. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- Creation and analysis of a Polish speech database for use in unit selection synthesis (Oliver & Szklanny, LREC 2006)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/688_pdf.pdf