Kristof D’hanens


2008

The AUTONOMATA Spoken Names Corpus
Henk van den Heuvel | Jean-Pierre Martens | Bart D’hoore | Kristof D’hanens | Nanneke Konings
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In the Autonomata project we have collected a corpus of spoken name utterances together with manually corrected phonemic transcriptions of these utterances. The corpus was designed to become a major resource for the development of automatic speech recognition engines that achieve high accuracy on person and geographical names spoken in Dutch. The recorded names were selected to reveal the major pronunciation variations that a speech recognizer is going to be confronted with, e.g. in a navigation system with speech input. This includes native speakers speaking foreign names and vice versa.