Abstract
Forced alignment is an effective process to speed up linguistic research. However, most forced aligners are language-dependent, and under-resourced languages rarely have enough resources to train an acoustic model for an aligner. We present a new Finnish grapheme-based forced aligner and demonstrate its performance by aligning multiple Uralic languages and English as an unrelated language. We show that even a simple, non-expert-created grapheme-to-phoneme mapping can result in useful word alignments.
- Anthology ID:
- 2021.nodalida-main.36
- Volume:
- Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
- Month:
- 31 May–2 June
- Year:
- 2021
- Address:
- Reykjavik, Iceland (Online)
- Editors:
- Simon Dobnik, Lilja Øvrelid
- Venue:
- NoDaLiDa
- Publisher:
- Linköping University Electronic Press, Sweden
- Pages:
- 345–350
- URL:
- https://aclanthology.org/2021.nodalida-main.36
- Cite (ACL):
- Juho Leinonen, Sami Virpioja, and Mikko Kurimo. 2021. Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 345–350, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.
- Cite (Informal):
- Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages (Leinonen et al., NoDaLiDa 2021)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2021.nodalida-main.36.pdf
- Code:
- aalto-speech/finnish-forced-alignment
- Data:
- LibriSpeech