Viktoriia Makovska


2026

Detecting disinformation narratives on social media is challenging due to the scale of amplification, rapid evolution, and linguistic variability of online content. We propose a graph-based framework for identifying and analyzing disinformation narratives in Telegram ecosystems by combining weak supervision with propagation graph analysis. The approach aggregates semantically related claims into narrative-level clusters and models their diffusion across interconnected channels. This enables the detection of coordinated narrative amplification that is difficult to capture through post-level analysis alone. Our results demonstrate that integrating textual signals with network structure provides a scalable method for detecting disinformation narratives and offers insights into how they propagate within large-scale messaging environments.
Adapting large language models to low-resource languages presents three interconnected challenges: inefficient tokenization, scarcity of high-quality annotated data, and limited resources for instruction tuning. We present a reproducible approach that addresses each challenge using data-centric methods that primarily rely on unlabeled text corpora, parallel translation data, and a multilingual base model. Our approach combines (1) vocabulary surgery for tokenizer adaptation without full retraining, (2) cross-lingual transfer of quality classifiers via translation, enabling filtering without target-language annotations, and (3) generation of instruction data through translation, task conversion, and targeted synthesis. We validate this recipe by adapting Gemma-3-12B to Ukrainian. Our pretrained model achieves top performance on Ukrainian benchmarks, and our instruction-tuned variant demonstrates strong performance on translation (33 BLEU on FLORES), summarization, and question-answering tasks, while requiring 1.5x fewer tokens than the original model for the same text. We release all models, datasets, classifiers, and code to enable replication for other languages.
We present an open, bachelor-level Natural Language Processing (NLP) course developed at Ukrainian Catholic University and delivered in Ukrainian. The course addresses several challenges in NLP education: adapting predominantly English-centric materials to a different linguistic and cultural context, supporting students with heterogeneous technical backgrounds, and balancing foundational theory with industry-relevant skills. All course materials, including lecture slides, notebooks, video recordings, and assignments, are publicly available. We describe our pedagogical design choices, focusing on culturally adapted tasks, integrated ethics, project-based assessment, and continuous student feedback. Our experience demonstrates that it is feasible to build a comprehensive and modern NLP curriculum from scratch in a non-English context, even when instructors come primarily from industry backgrounds.