Toward a Gold-Standard Benchmark for Evaluating Ukrainian Language Proficiency in LLMs
Svitlana Galeshchuk, Yuliia Maksymiuk, Yuliia Chernobrov, Nina Stankevych, Oleksandra Antoniv, Nataliia Faryna, Oksana Popkova
Abstract
The paper presents an expert-curated benchmark for assessing Ukrainian proficiency in LLMs, focusing on grammar and orthography as core components of language competence. Prepared by professional linguists, the proposed gold-standard dataset is designed to test normative Ukrainian usage. The benchmark is further used to evaluate a range of LLMs, including Ukrainian-focused, multilingual, and large-scale models, under zero-shot and few-shot prompting in Ukrainian and English. Across these settings, smaller models achieve no more than 42.1% accuracy, while large-scale LLMs reach up to 59.6%. These results show that standard Ukrainian remains challenging for current LLMs and highlight the need for stronger language-specific evaluation and adaptation.
- Anthology ID:
- 2026.unlp-1.12
- Volume:
- Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026)
- Month:
- May
- Year:
- 2026
- Address:
- Lviv, Ukraine
- Editor:
- Mariana Romanyshyn
- Venue:
- UNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 121–135
- URL:
- https://preview.aclanthology.org/corrections-2026-06/2026.unlp-1.12/
- Cite (ACL):
- Svitlana Galeshchuk, Yuliia Maksymiuk, Yuliia Chernobrov, Nina Stankevych, Oleksandra Antoniv, Nataliia Faryna, and Oksana Popkova. 2026. Toward a Gold-Standard Benchmark for Evaluating Ukrainian Language Proficiency in LLMs. In Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026), pages 121–135, Lviv, Ukraine. Association for Computational Linguistics.
- Cite (Informal):
- Toward a Gold-Standard Benchmark for Evaluating Ukrainian Language Proficiency in LLMs (Galeshchuk et al., UNLP 2026)
- PDF:
- https://preview.aclanthology.org/corrections-2026-06/2026.unlp-1.12.pdf