Bulgarian Massive Multitask Language Understanding Benchmark

Svetla Peneva Koeva; Ivelina Stoyanova; Dimiter Georgiev; Svetlozara Leseva; Valentina Stefanova; Maria Todorova; Tsvetana Ivanova Dimitrova; Hristina Kukova; Mihaela Moskova; Tinko Tinchev

Abstract

Assessing the broad general knowledge of Large Language Models (LLMs) across multiple domains in Bulgarian remains challenging due to the limited availability of Bulgarian evaluation benchmarks. To address this gap, we introduce the Bulgarian Massive Multitask Language Understanding benchmark (MMLU-BG), designed to evaluate whether LLMs possess generalised knowledge capabilities beyond simple text prediction in Bulgarian. This paper presents the structure, the development protocol, and the size of the MMLU-BG benchmark. The benchmark is evaluated alongside the original English MMLU on seven LLMs selected according to specific criteria. The experiments demonstrate that the MMLU-BG benchmark assesses multi-domain versatility and highlights the models’ strengths and weaknesses across different subject areas.

Anthology ID: 2026.lrec-main.366
Volume: Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month: May
Year: 2026
Address: Palma de Mallorca, Spain
Editors: Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue: LREC
Publisher: ELRA Language Resource Association
Pages: 4658–4672
URL: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.366/
Cite (ACL): Svetla Peneva Koeva, Ivelina Stoyanova, Dimiter Georgiev, Svetlozara Leseva, Valentina Stefanova, Maria Todorova, Tsvetana Ivanova Dimitrova, Hristina Kukova, Mihaela Moskova, and Tinko Tinchev. 2026. Bulgarian Massive Multitask Language Understanding Benchmark. International Conference on Language Resources and Evaluation, main:4658–4672.
Cite (Informal): Bulgarian Massive Multitask Language Understanding Benchmark (Koeva et al., LREC 2026)
PDF: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.366.pdf
