Oleksii Ignatenko


2025

Synonym-based adversarial tests reveal brittle lexical patterns that standard accuracy metrics overlook, yet virtually no such diagnostics exist for Ukrainian, a morphologically rich and low-resource language. We present the first systematic evaluation of robustness to synonym substitution in Ukrainian. Adapting TextFooler and BERT-Attack to Ukrainian, we (i) adjust a 15,000-entry synonym dictionary so that substitutes match the grammatical form of the original word; (ii) integrate semantic-similarity filters; (iii) constrain the masked-LM candidate search to generate only valid inflected forms. Across three text classification datasets (reviews, news headlines, social-media manipulation) and three transformer models (Ukr-RoBERTa, XLM-RoBERTa, SBERT), single-word swaps reduce accuracy by up to 12.6 percentage points, while multi-step attacks degrade performance by as much as 40.27 points using around 112 model queries. A few-shot transfer test shows that GPT-4o, a state-of-the-art multilingual LLM, still suffers accuracy drops of 6.9-15.0 points on the same adversarial samples. Our results underscore the need for sense-aware, morphology-constrained synonym resources and provide a reproducible benchmark for future robustness research in Ukrainian NLP.
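
To make the attack procedure described above concrete, the following is a minimal, hedged sketch of a greedy synonym-substitution attack with a query budget and a morphology constraint on substitutes. All names (classify, SYNONYMS, same_form) and the toy English dictionary are hypothetical placeholders for illustration only, not the authors' released code or the actual Ukrainian resources.

    # Illustrative sketch: greedy synonym-substitution attack with a
    # morphology constraint and a query budget. Placeholder names only.
    from typing import Callable, Dict, List

    # Toy lemma-level synonym dictionary (stand-in for the 15,000-entry
    # Ukrainian resource described in the abstract).
    SYNONYMS: Dict[str, List[str]] = {
        "good": ["fine", "decent"],
        "movie": ["film", "picture"],
    }

    def same_form(original: str, candidate: str) -> str:
        """Stand-in for the morphology step: re-inflect the candidate to
        match the original word's form. Here we only copy capitalization;
        a real system would match case, number, gender, etc."""
        return candidate.capitalize() if original[:1].isupper() else candidate

    def attack(text: str,
               classify: Callable[[str], float],
               max_queries: int = 112) -> str:
        """Greedy word-by-word substitution: at each position, try every
        synonym and keep the swap that most lowers the true-class score."""
        words = text.split()
        queries = 0
        base_score = classify(text)
        for i, word in enumerate(words):
            best_score, best_sub = base_score, None
            for cand in SYNONYMS.get(word.lower(), []):
                if queries >= max_queries:
                    return " ".join(words)
                trial = words[:i] + [same_form(word, cand)] + words[i + 1:]
                score = classify(" ".join(trial))
                queries += 1
                if score < best_score:
                    best_score, best_sub = score, same_form(word, cand)
            if best_sub is not None:
                words[i] = best_sub
                base_score = best_score
        return " ".join(words)

    if __name__ == "__main__":
        # Dummy classifier that assigns a high true-class score when "good" appears.
        def dummy(t: str) -> float:
            return 0.9 if "good" in t.lower() else 0.4
        print(attack("Good movie", dummy))  # -> "Fine movie"

The same greedy loop underlies both attack families; TextFooler draws candidates from the synonym dictionary, while BERT-Attack draws them from a masked language model, in both cases filtered for semantic similarity and valid inflection.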
