Garnishing a phonetic dictionary for ASR intake
Iben Nyholm Debess, Sandra Saxov Lamhauge, Peter Juel Henrichsen
Abstract
We present a new method for preparing a lexical-phonetic database as a resource for acoustic model training. The research is an offshoot of the ongoing Project Ravnur (Speech Recognition for Faroese), but the method is language-independent. At NODALIDA 2019 we demonstrate the method (called SHARP) online, showing how a traditional lexical-phonetic dictionary (with a very rich phone inventory) is transformed into an ASR-friendly database (with reduced phonetics, preventing data sparseness). The mapping procedure is informed by a corpus of speech transcripts. We conclude with a discussion on the benefits of a well-thought-out BLARK design (Basic Language Resource Kit), making tools like SHARP possible.
- Anthology ID:
- W19-6147
- Volume:
- Proceedings of the 22nd Nordic Conference on Computational Linguistics
- Month:
- September–October
- Year:
- 2019
- Address:
- Turku, Finland
- Editors:
- Mareike Hartmann, Barbara Plank
- Venue:
- NoDaLiDa
- Publisher:
- Linköping University Electronic Press
- Pages:
- 395–399
- URL:
- https://aclanthology.org/W19-6147
- Cite (ACL):
- Iben Nyholm Debess, Sandra Saxov Lamhauge, and Peter Juel Henrichsen. 2019. Garnishing a phonetic dictionary for ASR intake. In Proceedings of the 22nd Nordic Conference on Computational Linguistics, pages 395–399, Turku, Finland. Linköping University Electronic Press.
- Cite (Informal):
- Garnishing a phonetic dictionary for ASR intake (Debess et al., NoDaLiDa 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/W19-6147.pdf