Louis ten Bosch


2010

A Speech Corpus for Modeling Language Acquisition: CAREGIVER
Toomas Altosaar | Louis ten Bosch | Guillaume Aimetti | Christos Koniaris | Kris Demuynck | Henk van den Heuvel
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

A multilingual speech corpus for modeling language acquisition, called CAREGIVER, has been designed and recorded within the framework of the EU-funded Acquisition of Communication and Recognition Skills (ACORNS) project. The paper describes the motivation behind the corpus and its design, which relies on current knowledge of infant language acquisition. Instead of recording infants and children, the voices of their primary and secondary caregivers were captured as read speech in both infant-directed and adult-directed modes across four languages. The challenges and methods involved in obtaining prompts of similar complexity and semantics across the different languages, as well as the normalized recording procedures employed at the different locations, are covered. The corpus contains nearly 66000 utterance-based audio files spoken over a two-year period by 17 male and 17 female native speakers of Dutch, English, Finnish, and Swedish. An orthographic transcription is available for every utterance. Time-aligned word and phone annotations also exist for many of the sub-corpora. The CAREGIVER corpus will be published via ELRA.

2008

Evaluating the Relationship between Linguistic and Geographic Distances using a 3D Visualization
Folkert de Vriend | Jan Pieter Kunst | Louis ten Bosch | Charlotte Giesbers | Roeland van Hout
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we discuss how linguistic and geographic distances can be related using a 3D visualization. We will convert linguistic data for locations along the German-Dutch border to linguistic distances that can be compared directly to geographic distances. This enables us to visualize linguistic distances as “real” distances with the use of the third dimension available in 3D modelling software. With such a visualization, we will test whether descriptive dialect data support the hypothesis that the German-Dutch state border became a linguistic border between the German and Dutch dialects. Our visualization is implemented in the 3D modelling software SketchUp.