2026
Bulgarian Massive Multitask Language Understanding Benchmark
Svetla Peneva Koeva | Ivelina Stoyanova | Dimiter Georgiev | Svetlozara Leseva | Valentina Stefanova | Maria Todorova | Tsvetana Ivanova Dimitrova | Hristina Kukova | Mihaela Moskova | Tinko Tinchev
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Assessing the broad general knowledge of Large Language Models (LLMs) across multiple domains in Bulgarian remains challenging due to the limited availability of Bulgarian evaluation benchmarks. To address this gap, we introduce the Bulgarian Massive Multitask Language Understanding benchmark (MMLU-BG), designed to evaluate whether LLMs possess generalised knowledge capabilities in Bulgarian beyond simple text prediction. This paper presents the structure, the development protocol, and the size of the MMLU-BG benchmark. The benchmark is evaluated in comparison with the original English MMLU across seven LLMs selected according to specific criteria. The experiments demonstrate that MMLU-BG assesses multi-domain versatility and highlights the models’ strengths and weaknesses across different subject areas.