MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA

Adil Bahaj, Mounir Ghogho


Abstract
We present MizanQA, a benchmark for assessing LLMs on Moroccan legal MCQs, many with multiple correct answers. Covering 1,776 expert-verified questions in Modern Standard Arabic enriched with Moroccan idioms, the dataset reflects influences from Maliki jurisprudence, customary law, and French legal traditions. Unlike single-answer settings, MizanQA features variable option counts, creating added difficulty. We evaluate multilingual and Arabic-centric models in a zero-shot setting with native-Arabic prompts, measuring accuracy, a precision-penalized F1-like score, and calibration errors. Results show large performance gaps and miscalibration, particularly under stricter penalties. By scoping this benchmark to parametric knowledge only, we provide a baseline for future retrieval-augmented and rationale-focused setups.
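The three kinds of metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' scoring code: the abstract does not define the precision-penalized score, so an F-beta score with beta below 1 stands in for it here as an assumption, and the calibration error is taken to be the standard equal-width-bin expected calibration error. All function names and the example option letters are hypothetical.

```python
from typing import Sequence, Set


def exact_match(pred: Set[str], gold: Set[str]) -> float:
    """Strict accuracy: credit only when the selected set equals the gold set."""
    return 1.0 if pred == gold else 0.0


def set_f_beta(pred: Set[str], gold: Set[str], beta: float = 1.0) -> float:
    """Set-based F-beta over option labels.

    beta < 1 weights precision more heavily, so extra wrong selections cost
    more. This is an assumed stand-in for the paper's precision-penalized
    score, not its actual definition.
    """
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE: weighted mean of |confidence - accuracy| per bin."""
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    gold = {"A", "C"}            # hypothetical gold options
    pred = {"A", "B", "C"}       # model over-selects one wrong option
    print(exact_match(pred, gold))
    print(set_f_beta(pred, gold, beta=1.0))
    print(set_f_beta(pred, gold, beta=0.5))
    print(expected_calibration_error([0.9, 0.9, 0.6], [True, False, True]))
```

In this sketch, lowering beta makes over-selection cost more, which is one plausible reading of why results would degrade "under stricter penalties".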
Anthology ID:
2026.eacl-industry.10
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Yevgen Matusevych, Gülşen Eryiğit, Nikolaos Aletras
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
132–144
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.10/
Cite (ACL):
Adil Bahaj and Mounir Ghogho. 2026. MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 132–144, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA (Bahaj & Ghogho, EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.10.pdf