Benchmarking Mathematical Reasoning in a Low-Resource Language: Structured Prompting and Evaluation in Basque

Inigo Martinez-Criado; Aitor Soroa; Jeremy Barnes

Abstract

Large Language Models (LLMs) have shown impressive performance on tasks requiring complex reasoning, but most evaluations focus on English and other high-resource languages. This work investigates how well LLMs perform mathematical reasoning in low-resource languages, using Basque as the primary case study. To support this analysis, we introduce MASEU, a benchmark designed to evaluate reasoning in Basque across arithmetic, algebraic, and logical tasks. We use this dataset to address three key questions: 1) how well do LLMs support Basque in reasoning tasks, 2) to what extent can including English in prompts improve results, and 3) what is the effect of continued pretraining in Basque? To explore these questions, we use prompting strategies adapted for mathematical reasoning, building on chain-of-thought (CoT) prompting and one of its subsequent evolutions, DUP prompting. Together, these strategies allow for more precise experimentation across zero-shot and few-shot settings and provide insights into how multilingual models handle reasoning tasks in underrepresented languages.

Anthology ID: 2026.lrec-main.412
Volume: Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month: May
Year: 2026
Address: Palma de Mallorca, Spain
Editors: Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue: LREC
Publisher: ELRA Language Resource Association
Pages: 5268–5289
URL: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.412/
Cite (ACL): Inigo Martinez-Criado, Aitor Soroa, and Jeremy Barnes. 2026. Benchmarking Mathematical Reasoning in a Low-Resource Language: Structured Prompting and Evaluation in Basque. International Conference on Language Resources and Evaluation, main:5268–5289.
Cite (Informal): Benchmarking Mathematical Reasoning in a Low-Resource Language: Structured Prompting and Evaluation in Basque (Martinez-Criado et al., LREC 2026)
PDF: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.412.pdf
Optional supplementary material: 2026.lrec-main.412.OptionalSupplementaryMaterial.zip
