Petr Pollák

Also published as: Petr Pollak


2014

The Nijmegen Corpus of Casual Czech
Mirjam Ernestus | Lucie Kočková-Amortová | Petr Pollak
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available.

2010

Multi-Channel Database of Spontaneous Czech with Synchronization of Channels Recorded by Independent Devices
Petr Pollák | Josef Rajnoha
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes a Czech spontaneous speech database of lectures on digital signal processing collected at the Czech Technical University in Prague, together with the procedure for its recording and annotation. The database contains 21.7 hours of speech material from 22 speakers recorded in 4 channels with 3 principally different microphones. The annotation of the database consists of basic time segmentation, orthographic transcription including marks for speaker and environmental non-speech events, a pronunciation lexicon in the SAMPA alphabet, session and speaker information describing recording conditions, and the documentation. The orthographic transcription with time segmentation is saved in an XML format supported by the frequently used annotation tool Transcriber. In this article, special attention is also paid to the description of time synchronization of signals recorded by two independent devices: a computer-based recording platform using two external sound cards and the commercial audio recorder Edirol R09. This synchronization is based on cross-correlation analysis with simple automated selection of suitable short signal subparts. The collection and annotation of this database is now complete and its availability via ELRA is currently under preparation.
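The cross-correlation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `cross_correlation` and `estimate_lag`, the toy signals and the `max_lag` search window are all assumptions made for illustration, and the paper's automated selection of suitable short signal subparts is not reproduced here.

```python
def cross_correlation(a, b, lag):
    """Sum of a[n + lag] * b[n] over the samples where both signals overlap."""
    total = 0.0
    for n, value in enumerate(b):
        m = n + lag
        if 0 <= m < len(a):
            total += a[m] * value
    return total


def estimate_lag(a, b, max_lag):
    """Return the lag in [-max_lag, max_lag] that maximizes the cross-correlation.

    A positive result means the shared event appears later in `a` than in `b`.
    """
    return max(
        range(-max_lag, max_lag + 1),
        key=lambda lag: cross_correlation(a, b, lag),
    )


# Hypothetical toy data: the same short event captured by two devices
# that started recording at different moments.
pulse = [0.0, 1.0, 3.0, -2.0, 0.5]
device_a = [0.0] * 7 + pulse + [0.0] * 8   # event starts at sample 7
device_b = [0.0] * 4 + pulse + [0.0] * 11  # event starts at sample 4

lag = estimate_lag(device_a, device_b, max_lag=10)
print(lag)  # offset in samples between the two recordings
```

Shifting one recording by the estimated lag would bring the two channels into alignment. Real recordings would also need handling of noise, differing microphone characteristics and possible clock drift between devices, none of which this sketch addresses.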

2008

Phone Segmentation Tool with Integrated Pronunciation Lexicon and Czech Phonetically Labelled Reference Database.
Petr Pollák | Jan Volín | Radek Skarnitzl
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Phonetic segmentation is a procedure used in many applications of speech processing, both as a subpart of automated systems and as a tool for interactive work. In this paper we present the latest developments in our tool for automated phonetic segmentation. The tool is based on HMM forced alignment realized by the publicly available HTK toolkit. It is implemented in the environment of the Praat application and can be used with several optional settings. The tool is designed for segmentation of utterances with known orthographic records, while the phonetic contents are obtained from the pronunciation lexicon or, for new unknown words, from an orthoepic record generated by rules. The second part of this paper describes a small Czech reference database precisely labelled on the phonetic level, which is intended for analysing the accuracy of automatic phonetic segmentation.

2006

Methodology of Lombard Speech Database Acquisition: Experiences with CLSD
Hynek Bořil | Tomáš Bořil | Petr Pollák
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper, the process of acquiring the Czech Lombard Speech Database (CLSD'05) is presented. Feature analyses have shown a strong presence of the Lombard effect in the database. In a small vocabulary recognition task, significant performance degradation was observed for the Lombard speech recorded in the database. The aim of this paper is to describe the hardware platform, scenarios and recording tool used for the acquisition of CLSD'05. During the database recording and processing, several difficulties were encountered. The most important question was how to adjust the level of speech feedback for the speaker. A method for minimizing the speech attenuation introduced to the speaker by headphones is proposed in this paper. Finally, the contents and corpus of the database are presented to outline its suitability for analysis and modeling of the Lombard effect. The whole CLSD'05 database with detailed documentation is now released for public use.

2004

Orthographic and Phonetic Annotation of Very Large Czech Corpora with Quality Assessment
Petr Pollák | Jan Černocký
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2002

Tool for Czech Pronunciation Generation Combining Fixed Rules with Pronunciation Lexicon and Lexicon Management Tool
Petr Pollák | Václav Hanžl
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)