Viktoriia Makovska
2026
Graph-Based Detection of Disinformation Narrative Diffusion between Russian and Ukrainian Telegram Channels
Yuliia Vistak | Viktoriia Makovska | Vera Schmitt | Veronika Solopova
Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026)
Detecting disinformation narratives on social media is challenging due to the scale of amplification, rapid evolution, and linguistic variability of online content. We propose a graph-based framework for identifying and analyzing disinformation narratives in Telegram ecosystems by combining weak supervision with propagation graph analysis. The approach aggregates semantically related claims into narrative-level clusters and models their diffusion across interconnected channels. This enables the detection of coordinated narrative amplification that is difficult to capture through post-level analysis alone. Our results demonstrate that integrating textual signals with network structure provides a scalable method for detecting disinformation narratives and offers insights into how they propagate within large-scale messaging environments.
Data-Efficient Adaptation of Multilingual LLMs to Ukrainian
Yurii Paniv | Bohdan Didenko | Mykola Haltiuk | Vladyslav Humennyy | Andrian Kravchenko | Roman Kyslyi | Viktoriia Makovska | Artem Orlovskyi | Bohdan Ruban | Maksym-Yurii Rudko | Anastasiia Senyk | Nazarii Drushchak | Dmytro Chaplynskyi | Mariana Romanyshyn
Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026)
Adapting large language models to low-resource languages presents three interconnected challenges: inefficient tokenization, scarcity of high-quality annotated data, and limited resources for instruction tuning. We present a reproducible approach that addresses each challenge using data-centric methods that primarily rely on unlabeled text corpora, parallel translation data, and a multilingual base model. Our approach combines (1) vocabulary surgery for tokenizer adaptation without full retraining, (2) cross-lingual transfer of quality classifiers via translation, enabling filtering without target-language annotations, and (3) generation of instruction data through translation, task conversion, and targeted synthesis. We validate this recipe by adapting Gemma-3-12B to Ukrainian. Our pretrained model achieves top performance on Ukrainian benchmarks, and our instruction-tuned variant demonstrates strong performance on translation (33 BLEU on FLORES), summarization, and question-answering tasks while requiring 1.5x fewer tokens than the original model for the same text. We release all models, datasets, classifiers, and code to enable replication for other languages.
Bridging Applied Experience and Research Contexts in Ukrainian NLP Education
Yurii Paniv | Viktoriia Makovska
Proceedings of the Seventh Workshop on Teaching Natural Language Processing (TeachNLP 2026)
We present an open, bachelor-level Natural Language Processing (NLP) course developed at Ukrainian Catholic University and delivered in Ukrainian. The course addresses several challenges in NLP education: adapting predominantly English-centric materials to a different linguistic and cultural context, supporting students with heterogeneous technical backgrounds, and balancing foundational theory with industry-relevant skills. All course materials, including lecture slides, notebooks, video recordings, and assignments, are publicly available. We describe our pedagogical design choices, focusing on culturally adapted tasks, integrated ethics, project-based assessment, and continuous student feedback. Our experience demonstrates that it is feasible to build a comprehensive and modern NLP curriculum from scratch in a non-English context, even when instructors come primarily from industry backgrounds.