Evaluating Repetitions, or how to Improve your Multilingual ASR System by doing Nothing

Marijn Schraagen, Gerrit Bloothooft


Abstract
Repetition is a common concept in human communication. This paper investigates possible benefits of repetition for automatic speech recognition under controlled conditions. Testing is performed on the newly created Autonomata TOO speech corpus, consisting of multilingual names for Points-Of-Interest as spoken by both native and non-native speakers. During corpus recording, ASR was being performed under baseline conditions using a Nuance Vocon 3200 system. On failed recognition, additional attempts for the same utterances were added to the corpus. Substantial improvements in recognition results are shown for all categories of speakers and utterances, even if speakers did not noticeably alter their previously misrecognized pronunciation. A categorization is proposed for various types of differences between utterance realisations. The number of attempts, the pronunciation of an utterance over multiple attempts compared to both previous attempts and reference pronunciation is analyzed for difference type and frequency. Variables such as the native language of the speaker and the languages in the lexicon are taken into account. Possible implications for ASR research are discussed.
Anthology ID:
L10-1205
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/295_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Marijn Schraagen and Gerrit Bloothooft. 2010. Evaluating Repetitions, or how to Improve your Multilingual ASR System by doing Nothing. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Evaluating Repetitions, or how to Improve your Multilingual ASR System by doing Nothing (Schraagen & Bloothooft, LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/295_Paper.pdf