TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs

Ezgi Başar, Francesca Padovani, Jaap Jumelet, Arianna Bisazza


Abstract
We introduce TurBLiMP, the first Turkish benchmark of linguistic minimal pairs, designed to evaluate the linguistic abilities of monolingual and multilingual language models (LMs). Covering 16 linguistic phenomena with 1000 minimal pairs each, TurBLiMP fills an important gap in linguistic evaluation resources for Turkish. In designing the benchmark, we pay particular attention to two properties of Turkish that remain understudied in current syntactic evaluations of LMs: word order flexibility and subordination through morphological processes. Our experiments on a wide range of LMs and a newly collected set of human acceptability judgments reveal that even cutting-edge large LMs still struggle with grammatical phenomena that are not challenging for humans, and that their sensitivity to word order and morphological complexity may differ from that of humans.
Anthology ID:
2025.emnlp-main.834
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
16506–16521
URL:
https://preview.aclanthology.org/lei-li-partial-disambiguation/2025.emnlp-main.834/
Cite (ACL):
Ezgi Başar, Francesca Padovani, Jaap Jumelet, and Arianna Bisazza. 2025. TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 16506–16521, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs (Başar et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/lei-li-partial-disambiguation/2025.emnlp-main.834.pdf
Checklist:
2025.emnlp-main.834.checklist.pdf