Oleksii Ignatenko
2025
Precision vs. Perturbation: Robustness Analysis of Synonym Attacks in Ukrainian NLP
Volodymyr Mudryi | Oleksii Ignatenko
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)
Synonym-based adversarial tests reveal fragile word patterns that accuracy metrics overlook, yet virtually no such diagnostics exist for Ukrainian, a morphologically rich and low-resource language. We present the first systematic robustness evaluation under synonym substitution in Ukrainian. Adapting TextFooler and BERT-Attack to Ukrainian, we (i) adjust a 15,000-entry synonym dictionary to match proper word forms; (ii) integrate similarity filters; and (iii) adapt the masked-LM search so that it generates only valid inflected words. Across three text classification datasets (reviews, news headlines, social-media manipulation) and three transformer models (Ukr-RoBERTa, XLM-RoBERTa, SBERT), single-word swaps reduce accuracy by up to 12.6, while multi-step attacks degrade performance by as much as 40.27 with around 112 model queries. A few-shot transfer test shows that GPT-4o, a state-of-the-art multilingual LLM, still suffers drops of 6.9–15.0 on the same adversarial samples. Our results underscore the need for sense-aware, morphology-constrained synonym resources and provide a reproducible benchmark for future robustness research in Ukrainian NLP.
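To make the idea of a synonym-substitution attack concrete, here is a minimal sketch of a greedy, importance-ranked word swap against a classifier, loosely in the spirit of dictionary-based attacks such as TextFooler. It is not the authors' implementation: the toy scoring function, the tiny English synonym table, and the names `toy_score` and `greedy_synonym_attack` are all assumptions made for illustration, and the similarity filters and morphological form matching described in the abstract are deliberately left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy synonym table (illustration only; not the paper's dictionary).
SYNONYMS: Dict[str, List[str]] = {
    "great": ["fine", "decent"],
    "awful": ["poor", "bad"],
}

# Hypothetical word weights for a stand-in sentiment classifier.
POSITIVE = {"great": 2.0, "fine": 0.5, "decent": 0.2, "good": 1.0}
NEGATIVE = {"awful": 2.0, "poor": 1.0, "bad": 1.0, "slow": 1.5}


def toy_score(tokens: List[str]) -> float:
    """Stand-in for a real model: a score > 0 means 'positive'."""
    return sum(POSITIVE.get(t, 0.0) - NEGATIVE.get(t, 0.0) for t in tokens)


def greedy_synonym_attack(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
    synonyms: Dict[str, List[str]],
    max_swaps: int = 3,
) -> Tuple[List[str], bool, int]:
    """Return (adversarial tokens, label flipped?, number of model queries)."""
    queries = 0

    def query(toks: List[str]) -> float:
        nonlocal queries
        queries += 1
        return score_fn(toks)

    base = query(tokens)
    orig_positive = base > 0
    sign = 1.0 if orig_positive else -1.0  # signed margin toward the original label

    # Word importance: how much the signed margin drops when the word is deleted.
    importance = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        importance.append((sign * (base - query(reduced)), i))
    importance.sort(reverse=True)

    current, cur_margin, swaps = list(tokens), base, 0
    for _, i in importance:
        if swaps >= max_swaps:
            break
        best = None
        for cand in synonyms.get(tokens[i], []):
            trial = current[:i] + [cand] + current[i + 1:]
            m = query(trial)
            # Keep the candidate that weakens the original label the most.
            if sign * m < sign * cur_margin and (best is None or sign * m < sign * best[0]):
                best = (m, trial)
        if best is not None:
            cur_margin, current = best
            swaps += 1
            if (cur_margin > 0) != orig_positive:
                return current, True, queries
    return current, False, queries


adv, flipped, n_queries = greedy_synonym_attack(
    ["great", "food", "but", "slow", "service"], toy_score, SYNONYMS
)
print(adv, flipped, n_queries)
```

In a realistic setting `score_fn` would wrap a transformer classifier, and each candidate would first have to pass a semantic-similarity check and agree in inflection with the word it replaces; the query counter corresponds to the kind of model-query budget the abstract reports.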
2024
Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024
Mariana Romanyshyn | Nataliia Romanyshyn | Andrii Hlybovets | Oleksii Ignatenko