A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus

Mauro Nicolao, Heidi Christensen, Stuart Cunningham, Phil Green, Thomas Hain


Abstract
This paper introduces a new British English speech database, the homeService corpus, which has been gathered as part of the homeService project. The project aims to help users with speech and motor disabilities operate their home appliances using voice commands. The audio recorded during such interactions consists of realistic speech from speakers with severe dysarthria. The majority of the homeService corpus is recorded in real home environments, where voice control is often the normal means by which users interact with their devices. The collection of the corpus is motivated by the shortage of realistic dysarthric speech corpora available to the scientific community. Along with details on how the data is organised and how it can be accessed, a brief description of the framework used to make the recordings is provided. Finally, the performance of the homeService automatic recogniser for dysarthric speech, trained with single-speaker data from the corpus, is reported as an initial baseline. Access to the homeService corpus is provided through the dedicated web page at http://mini.dcs.shef.ac.uk/resources/homeservice-corpus/. This page will also carry the most up-to-date description of the data. At the time of writing, the collection process is still ongoing.
Anthology ID:
L16-1315
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1993–1997
URL:
https://aclanthology.org/L16-1315
Cite (ACL):
Mauro Nicolao, Heidi Christensen, Stuart Cunningham, Phil Green, and Thomas Hain. 2016. A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1993–1997, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus (Nicolao et al., LREC 2016)
PDF:
https://preview.aclanthology.org/ingestion-script-update/L16-1315.pdf