Volodymyr Sydorskyi
2026
The UNLP 2026 Shared Task on Multi-Domain Document Understanding
Volodymyr Sydorskyi | Nataliia Romanyshyn | Roman Kyslyi | Olena Nahorna
Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026)
This paper presents the results of the UNLP 2026 Shared Task on Multi-Domain Document Understanding. The task assesses how well AI systems can find the right information in a stack of domain-specific documents and generalize across domains. Participants were required not only to select the correct answer, but also to localize it by predicting the corresponding document and page. A total of 54 teams registered for the competition, 15 teams submitted systems, and 513 runs were evaluated on a hidden test set via Kaggle in a code-only submission format under constrained computational resources. The Kaggle leaderboard is left open for further submissions. In summary, this work establishes a Ukrainian multi-domain document understanding benchmark consisting of (1) a collected dataset, (2) a proposed evaluation metric, and (3) an analysis of top-performing systems evaluated under a unified framework.
2025
The UNLP 2025 Shared Task on Detecting Social Media Manipulation
Roman Kyslyi | Nataliia Romanyshyn | Volodymyr Sydorskyi
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)
This paper presents the results of the UNLP 2025 Shared Task on Detecting Social Media Manipulation. The task included two tracks: Technique Classification and Span Identification. The benchmark dataset contains 9,557 posts from Ukrainian Telegram channels manually annotated by media experts. A total of 51 teams registered, 22 teams submitted systems, and 595 runs were evaluated on a hidden test set via Kaggle. Performance was measured with macro F1 for classification and token-level F1 for identification. The shared task provides the first publicly available benchmark for manipulation detection in Ukrainian social media and highlights promising directions for low-resource propaganda research. The Kaggle leaderboard is left open for further submissions.