MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs
Jaap Jumelet, Leonie Weissweiler, Joakim Nivre, Arianna Bisazza
Abstract
We introduce MultiBLiMP 1.0, a massively multilingual benchmark of linguistic minimal pairs, covering 101 languages and 2 types of subject-verb agreement, containing more than 128,000 minimal pairs. Our minimal pairs are created using a fully automated pipeline, leveraging the large-scale linguistic resources of Universal Dependencies and UniMorph. MultiBLiMP 1.0 evaluates abilities of LLMs at an unprecedented multilingual scale, and highlights the shortcomings of the current state-of-the-art in modelling low-resource languages.
- Anthology ID:
- 2026.tacl-1.10
- Volume:
- Transactions of the Association for Computational Linguistics, Volume 14
- Year:
- 2026
- Address:
- Cambridge, MA
- Venue:
- TACL
- Publisher:
- MIT Press
- Pages:
- 193–216
- URL:
- https://preview.aclanthology.org/ingest-eacl/2026.tacl-1.10/
- DOI:
- 10.1162/tacl.a.600
- Cite (ACL):
- Jaap Jumelet, Leonie Weissweiler, Joakim Nivre, and Arianna Bisazza. 2026. MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs. Transactions of the Association for Computational Linguistics, 14:193–216.
- Cite (Informal):
- MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs (Jumelet et al., TACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-eacl/2026.tacl-1.10.pdf