Kazumasa Yamamoto

2022

pdf bib abs
Elderly Conversational Speech Corpus with Cognitive Impairment Test and Pilot Dementia Detection Experiment Using Acoustic Characteristics of Speech in Japanese Dialects
Meiko Fukuda | Ryota Nishimura | Maina Umezawa | Kazumasa Yamamoto | Yurie Iribe | Norihide Kitaoka
Proceedings of the Thirteenth Language Resources and Evaluation Conference

There is a need for a simple method of detecting early signs of dementia which is not burdensome to patients, since early diagnosis and treatment can often slow the advance of the disease. Several studies have explored using only the acoustic and linguistic information of conversational speech as diagnostic material, with some success. To accelerate this research, we recorded natural conversations between 128 elderly people living in four different regions of Japan and interviewers, who also administered the Hasegawa’s Dementia Scale-Revised (HDS-R), a cognitive impairment test. Using our elderly speech corpus and dementia test results, we propose an SVM-based screening method which can detect dementia using the acoustic features of conversational speech even when regional dialects are present. We accomplish this by omitting some acoustic features, to limit the negative effect of differences between dialects. When using our proposed method, a dementia detection accuracy rate of about 91% was achieved for speakers from two regions. When speech from four regions was used in a second experiment, the discrimination rate fell to 76.6%, but this may have been due to using only sentence-level acoustic features in the second experiment, instead of sentence and phoneme-level features as in the previous experiment. This is an on-going research project, and additional investigation is needed to understand differences in the acoustic characteristics of phoneme units in the conversational speech collected from these four regions, to determine whether the removal of formants and other features can improve the dementia detection rate.

2008

Recently, speech recognition performance has been drastically improved by statistical methods and huge speech databases. Now performance improvement under such realistic environments as noisy conditions is being focused on. Since October 2001, we from the working group of the Information Processing Society in Japan have been working on evaluation methodologies and frameworks for Japanese noisy speech recognition. We have released frameworks including databases and evaluation tools called CENSREC-1 (Corpus and Environment for Noisy Speech RECognition 1; formerly AURORA-2J), CENSREC-2 (in-car connected digits recognition), CENSREC-3 (in-car isolated word recognition), and CENSREC-1-C (voice activity detection under noisy conditions). In this paper, we newly introduce a collection of databases and evaluation tools named CENSREC-4, which is an evaluation framework for distant-talking speech under hands-free conditions. Distant-talking speech recognition is crucial for a hands-free speech interface. Therefore, we measured room impulse responses to investigate reverberant speech recognition. The results of evaluation experiments proved that CENSREC-4 is an effective database suitable for evaluating the new dereverberation method because the traditional dereverberation process had difficulty sufficiently improving the recognition performance. The framework was released in March 2008, and many studies are being conducted with it in Japan.