MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs

Jaap Jumelet; Leonie Weissweiler; Joakim Nivre; Arianna Bisazza

doi:10.1162/tacl.a.600

Abstract

We introduce MultiBLiMP 1.0, a massively multilingual benchmark of linguistic minimal pairs, covering 101 languages and 2 types of subject-verb agreement, containing more than 128,000 minimal pairs. Our minimal pairs are created using a fully automated pipeline, leveraging the large-scale linguistic resources of Universal Dependencies and UniMorph. MultiBLiMP 1.0 evaluates abilities of LLMs at an unprecedented multilingual scale, and highlights the shortcomings of the current state-of-the-art in modelling low-resource languages.

Anthology ID: 2026.tacl-1.10
Volume: Transactions of the Association for Computational Linguistics, Volume 14
Year: 2026
Address: Cambridge, MA
Venue: TACL
Publisher: MIT Press
Pages: 193–216
URL: https://preview.aclanthology.org/ingest-eacl/2026.tacl-1.10/
DOI: 10.1162/tacl.a.600
Cite (ACL): Jaap Jumelet, Leonie Weissweiler, Joakim Nivre, and Arianna Bisazza. 2026. MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs. Transactions of the Association for Computational Linguistics, 14:193–216.
Cite (Informal): MultiBLiMP 1.0: A Massively Multilingual Benchmark of Linguistic Minimal Pairs (Jumelet et al., TACL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.tacl-1.10.pdf
