The AUTONOMATA Spoken Names Corpus
Henk van den Heuvel, Jean-Pierre Martens, Bart D’hoore, Kristof D’hanens, Nanneke Konings
Abstract
In the Autonomata project we have collected a corpus of spoken name utterances, together with manually corrected phonemic transcriptions of these utterances. The corpus is intended to become a major resource for the development of automatic speech recognition engines that achieve high accuracy in recognizing person names and geographical names spoken in Dutch. The recorded names were selected to reveal the major pronunciation variations that a speech recognizer, e.g. in a navigation system with speech input, will be confronted with. This includes native speakers speaking foreign names and vice versa.
- Anthology ID:
- L08-1476
- Volume:
- Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
- Month:
- May
- Year:
- 2008
- Address:
- Marrakech, Morocco
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2008/pdf/48_paper.pdf
- Cite (ACL):
- Henk van den Heuvel, Jean-Pierre Martens, Bart D’hoore, Kristof D’hanens, and Nanneke Konings. 2008. The AUTONOMATA Spoken Names Corpus. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
- Cite (Informal):
- The AUTONOMATA Spoken Names Corpus (van den Heuvel et al., LREC 2008)