Cross-lingual studies of ASR errors: paradigms for perceptual evaluations

Ioana Vasilescu, Martine Adda-Decker, Lori Lamel


Abstract
It is well known that human listeners significantly outperform machines when it comes to transcribing speech. This paper presents a progress report on joint research into automatic versus human speech transcription and on the perceptual experiments developed at LIMSI that aim to increase our understanding of automatic speech recognition errors. Two paradigms are described in which human listeners are asked to transcribe speech segments containing words that are frequently misrecognized by the system. In particular, we sought to gain information about the impact of increased context in helping humans disambiguate problematic lexical items, typically homophone or near-homophone words. The long-term aim of this research is to improve the modeling of ambiguous contexts so as to reduce automatic transcription errors.
Anthology ID:
L12-1134
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3511–3518
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/300_Paper.pdf
Cite (ACL):
Ioana Vasilescu, Martine Adda-Decker, and Lori Lamel. 2012. Cross-lingual studies of ASR errors: paradigms for perceptual evaluations. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3511–3518, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Cross-lingual studies of ASR errors: paradigms for perceptual evaluations (Vasilescu et al., LREC 2012)