B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation Models

Md Tanzib Hosain, Md Kishor Morol


Abstract
The rapid growth of large language models (LLMs) creates an urgent need for new NLP benchmarks. We present B-REASO, the first comprehensive Bengali evaluation suite designed to assess the advanced knowledge and reasoning abilities of foundation models in a Bengali-language setting. B-REASO consists of multiple-choice questions at four difficulty levels: middle school, high school, college, and professional. The questions span 50 different fields, from science and engineering to the humanities. B-REASO is accompanied by B-REASO HEAVY, a subset of particularly difficult B-REASO subjects that require sophisticated reasoning to answer. We conduct a thorough evaluation of the most advanced LLMs on B-REASO, including English-oriented models. Results show that only Claude-3.5-Sonnet achieved an average accuracy above 65%, indicating that current LLMs still have substantial room for improvement. We hope that B-REASO will support the development and growth of foundation models for Bengali users by helping to analyze the key strengths and weaknesses of these models. We open-source our code and data at https://github.com/kraritt/b-reaso.
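To illustrate how an average accuracy over the four difficulty levels could be computed, here is a minimal sketch. It does not reproduce the authors' evaluation code: the record layout (`level`, `answer`, `prediction`), the function name `accuracy_by_level`, and the choice of an unweighted mean over levels are all assumptions made for this example.

```python
from collections import defaultdict

# Difficulty levels named in the abstract.
LEVELS = ("middle school", "high school", "college", "professional")


def accuracy_by_level(records):
    """Return {level: accuracy} plus an unweighted "average" over levels.

    Each record is a dict with hypothetical keys:
      "level"      - one of LEVELS
      "answer"     - gold option label, e.g. "A".."D"
      "prediction" - option label chosen by the model
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["level"]] += 1
        if record["prediction"] == record["answer"]:
            correct[record["level"]] += 1

    # Only report levels that actually have questions.
    scores = {lvl: correct[lvl] / total[lvl] for lvl in LEVELS if total[lvl]}
    # Assumption: macro-average across levels; the paper may average differently.
    scores["average"] = sum(scores.values()) / len(scores) if scores else 0.0
    return scores
```

Calling `accuracy_by_level` on a list of such records yields one accuracy per level that has questions, plus their mean. A per-question (micro) average would weight levels by their question counts instead.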
Anthology ID:
2025.findings-emnlp.492
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9260–9274
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.492/
DOI:
10.18653/v1/2025.findings-emnlp.492
Cite (ACL):
Md Tanzib Hosain and Md Kishor Morol. 2025. B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 9260–9274, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
B-REASO: A Multi-Level Multi-Faceted Bengali Evaluation Suite for Foundation Models (Hosain & Morol, Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.492.pdf
Checklist:
 2025.findings-emnlp.492.checklist.pdf