Abstract
Current continuous speech recognition systems essentially ignore unknown words: they are designed to recognize only words in the lexicon. However, for speech recognition systems to be used in real spoken-language processing applications, it is very important to process unknown words. This paper proposes a continuous speech recognition method that accepts any utterance that might include unknown words. In this method, words not in the lexicon are transcribed as phone sequences, while words in the lexicon are recognized correctly. The HMM-LR speech recognition system, which integrates Hidden Markov Models and generalized LR parsing, is used as the baseline system and is enhanced with a trigram model of syllables to take into account the stochastic characteristics of a language. Preliminary results indicate that our approach is very promising.
- Anthology ID:
- 1991.iwpt-1.16
- Volume:
- Proceedings of the Second International Workshop on Parsing Technologies
- Month:
- February 13-25
- Year:
- 1991
- Address:
- Cancun, Mexico
- Venue:
- IWPT
- SIG:
- SIGPARSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 136–142
- URL:
- https://aclanthology.org/1991.iwpt-1.16
- Cite (ACL):
- Kenji Kita, Terumasa Ehara, and Tsuyoshi Morimoto. 1991. Processing Unknown Words in Continuous Speech Recognition. In Proceedings of the Second International Workshop on Parsing Technologies, pages 136–142, Cancun, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Processing Unknown Words in Continuous Speech Recognition (Kita et al., IWPT 1991)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/1991.iwpt-1.16.pdf