Creation of Learner Corpus and Its Application to Speech Recognition

Hiroki Yamazaki; Keisuke Kitamura; Takashi Harada; Seiichi Yamamoto

Abstract

Major languages such as English are spoken by many people whose mother tongues are other languages. Their second-language speech often has not only a distinct accent but also different lexical and syntactic characteristics. Speech recognition performance is severely affected when the lexical, syntactic, or semantic characteristics of the training and recognition tasks differ. The language model of a speech recognition system is usually trained on transcribed speech or text data collected in native English-speaking countries; speech recognition performance is therefore expected to be degraded by the mismatch in lexical and syntactic characteristics between native speakers and second-language speakers, as well as by the difference between their accents. The aim of language model adaptation is to exploit specific, albeit limited, knowledge about the recognition task to compensate for a mismatch in lexical, syntactic, or semantic characteristics. This paper examines whether language model adaptation is effective in compensating for the mismatch between the lexical, syntactic, or semantic characteristics of native speakers and second-language speakers.

Anthology ID: L08-1520
Volume: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month: May
Year: 2008
Address: Marrakech, Morocco
Editors: Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue: LREC
Publisher: European Language Resources Association (ELRA)
URL: http://www.lrec-conf.org/proceedings/lrec2008/pdf/39_paper.pdf
Cite (ACL): Hiroki Yamazaki, Keisuke Kitamura, Takashi Harada, and Seiichi Yamamoto. 2008. Creation of Learner Corpus and Its Application to Speech Recognition. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal): Creation of Learner Corpus and Its Application to Speech Recognition (Yamazaki et al., LREC 2008)
PDF: http://www.lrec-conf.org/proceedings/lrec2008/pdf/39_paper.pdf
