Standardising Pronunciation for a Grapheme-to-Phoneme Converter for Faroese
Sandra Lamhauge, Iben Debess, Carlos Hernández Mena, Annika Simonsen, Jon Gudnason
Abstract
Pronunciation dictionaries allow computational modelling of how words are pronounced in a given language and are widely used in speech technologies, especially in speech recognition and synthesis. A grapheme-to-phoneme (G2P) tool, by contrast, generalizes a pronunciation dictionary: it is not limited to a given, finite vocabulary. In this paper, we present a set of standardized phonological rules for the Faroese language; we introduce FARSAMPA, a machine-readable character set suitable for phonetic transcription of Faroese; and we present a set of grapheme-to-phoneme models for Faroese, which are publicly available and shared under a Creative Commons license. We also present the G2P converter and evaluate its performance. The evaluation shows reliable results that demonstrate the quality of the data.
- Anthology ID:
- 2023.nodalida-1.32
- Volume:
- Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
- Month:
- May
- Year:
- 2023
- Address:
- Tórshavn, Faroe Islands
- Editors:
- Tanel Alumäe, Mark Fishel
- Venue:
- NoDaLiDa
- Publisher:
- University of Tartu Library
- Pages:
- 308–317
- URL:
- https://aclanthology.org/2023.nodalida-1.32
- Cite (ACL):
- Sandra Lamhauge, Iben Debess, Carlos Hernández Mena, Annika Simonsen, and Jon Gudnason. 2023. Standardising Pronunciation for a Grapheme-to-Phoneme Converter for Faroese. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 308–317, Tórshavn, Faroe Islands. University of Tartu Library.
- Cite (Informal):
- Standardising Pronunciation for a Grapheme-to-Phoneme Converter for Faroese (Lamhauge et al., NoDaLiDa 2023)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2023.nodalida-1.32.pdf