Yurii Paniv


2025

Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains
Yurii Paniv | Artur Kiulian | Dmytro Chaplynskyi | Mykola Khandoga | Anton Polishko | Tetiana Bas | Guillermo Gabrielli
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)

While the evaluation of multimodal English-centric models is an active area of research with numerous benchmarks, there is a profound lack of benchmarks or evaluation suites for low- and mid-resource languages. We introduce ZNO-Vision, a comprehensive multimodal Ukrainian-centric benchmark derived from the standardized university entrance examination (ZNO). The benchmark consists of over 4,300 expert-crafted questions spanning 12 academic disciplines, including mathematics, physics, chemistry, and the humanities. We evaluated both open-source models and API providers, finding that only a handful of models performed above baseline. Alongside the new benchmark, we performed the first evaluation study of multimodal text generation for the Ukrainian language, measuring caption generation quality on the Multi30K-UK dataset. Lastly, we tested several models from a cultural perspective, probing their knowledge of national cuisine. We believe our work will advance multimodal generation capabilities for the Ukrainian language, and our approach could be useful for other low-resource languages.
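The caption-generation part of this study reports automatic quality scores on Multi30K-UK. As a hedged illustration only (the paper's exact metric setup is not specified here, and the file names below are placeholders), corpus-level BLEU for generated Ukrainian captions could be computed with sacrebleu along these lines:

# Minimal sketch of scoring Ukrainian caption generation with corpus-level BLEU.
# The file names and the choice of sacrebleu are assumptions, not the paper's exact setup.
import sacrebleu

def score_captions(hypotheses_path: str, references_path: str) -> float:
    """Read one caption per line and return the corpus BLEU score."""
    with open(hypotheses_path, encoding="utf-8") as f:
        hypotheses = [line.strip() for line in f]
    with open(references_path, encoding="utf-8") as f:
        references = [line.strip() for line in f]

    # sacrebleu expects a list of hypothesis strings and a list of reference streams.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    return bleu.score

if __name__ == "__main__":
    print(score_captions("model_captions.uk.txt", "multi30k_uk_references.txt"))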

UAlign: LLM Alignment Benchmark for the Ukrainian Language
Andrian Kravchenko | Yurii Paniv | Nazarii Drushchak
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)

This paper introduces UAlign, a comprehensive benchmark for evaluating the alignment of Large Language Models (LLMs) in the Ukrainian language. The benchmark consists of two complementary components: a moral judgment dataset with 3,682 scenarios of varying ethical complexity and a dataset with 1,700 ethical situations presenting clear normative distinctions. Each element provides parallel English-Ukrainian text pairs, enabling cross-lingual comparison. Unlike existing resources, which are predominantly developed for high-resource languages, our benchmark addresses the critical need for evaluation resources in Ukrainian. The development process involved machine translation and linguistic validation using Ukrainian language models for grammatical error correction. Our cross-lingual evaluation of six LLMs confirmed a performance gap between alignment in Ukrainian and English, while also providing valuable insights into the overall alignment capabilities of these models. The benchmark has been made publicly available to facilitate further research and commercial applications.

Warning: The datasets introduced in this paper contain sensitive materials related to ethical and moral scenarios that may include offensive, harmful, illegal, or controversial content.

Context-Aware Lexical Stress Prediction and Phonemization for Ukrainian TTS Systems
Anastasiia Senyk | Mykhailo Lukianchuk | Valentyna Robeiko | Yurii Paniv
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)

Text preprocessing is a fundamental component of high-quality speech synthesis. This work presents a novel rule-based phonemizer combined with a sentence-level lexical stress prediction model to improve phonetic accuracy and prosody prediction in text-to-speech pipelines. We also introduce a new benchmark dataset with annotated stress patterns designed for evaluating lexical stress prediction systems at the sentence level. Experimental results demonstrate that the proposed phonemizer achieves a 1.23% word error rate on a manually constructed pronunciation dataset, while the lexical stress prediction pipeline performs on par with dictionary-based methods and outperforms existing neural network solutions.
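For context, the 1.23% figure above is a word error rate (WER) against a manually constructed pronunciation dataset. The following is a minimal, generic WER sketch (not the authors' evaluation code), assuming one reference and one hypothesis string of space-separated words:

# Generic word error rate via Levenshtein distance over words.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance table.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one wrong word out of four gives WER = 0.25.
print(word_error_rate("кіт сидить на вікні", "кіт сидить на вікно"))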

2024

Setting up the Data Printer with Improved English to Ukrainian Machine Translation
Yurii Paniv | Dmytro Chaplynskyi | Nikita Trynus | Volodymyr Kyrylov
Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024

To build large language models for Ukrainian, we need to expand our corpora with large amounts of new algorithmic tasks expressed in natural language. Examples of task performance expressed in English are abundant, so a high-quality translation system would enable our community to curate datasets faster. To aid this goal, we introduce a recipe for building a translation system by supervised finetuning of a large pretrained language model on a noisy parallel dataset of 3M Ukrainian-English sentence pairs, followed by a second training phase on 17K examples selected by k-fold perplexity filtering from another, higher-quality dataset. Our decoder-only model, named Dragoman, outperforms previous state-of-the-art encoder-decoder models on the FLORES devtest set.
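As a rough illustration of the data-selection idea behind the second training phase, the sketch below scores candidate translation pairs by perplexity under a pretrained causal language model and keeps the lowest-perplexity fraction. The scoring model name, the prompt format, and the keep fraction are placeholder assumptions; the paper's k-fold variant additionally rotates which data fold the scoring model sees, which is omitted here.

# Sketch of perplexity-based filtering of translation pairs (assumptions noted above).
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "mistralai/Mistral-7B-v0.1"  # placeholder scoring model, not necessarily the paper's
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    """Per-token perplexity of a formatted translation example."""
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    loss = model(**inputs, labels=inputs["input_ids"]).loss
    return math.exp(loss.item())

def filter_pairs(pairs, keep_fraction=0.1):
    """Keep the lowest-perplexity (most fluent/consistent) fraction of (en, uk) pairs."""
    # The "[INST] ... [/INST]" template is an assumed prompt format for illustration only.
    scored = [(perplexity(f"[INST] {en} [/INST] {uk}"), en, uk) for en, uk in pairs]
    scored.sort(key=lambda x: x[0])
    cutoff = int(len(scored) * keep_fraction)
    return [(en, uk) for _, en, uk in scored[:cutoff]]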