Anastasiia Senyk


2025

pdf bib
Context-Aware Lexical Stress Prediction and Phonemization for Ukrainian TTS Systems
Anastasiia Senyk | Mykhailo Lukianchuk | Valentyna Robeiko | Yurii Paniv
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)

Text preprocessing is a fundamental component of high-quality speech synthesis. This work presents a novel rule-based phonemizer combined with a sentence-level lexical stress prediction model to improve phonetic accuracy and prosody prediction in the text-to-speech pipelines. We also introduce a new benchmark dataset with annotated stress patterns designed for evaluating lexical stress prediction systems at the sentence level.Experimental results demonstrate that the proposed phonemizer achieves a 1.23% word error rate on a manually constructed pronunciation dataset, while the lexical stress prediction pipeline shows results close to dictionary-based methods, outperforming existing neural network solutions.