A Morphology-Aware Evaluation of Turkish Syntax in Large Language Models

Ezgi Başar; Arianna Bisazza

Abstract

Minimal pair benchmarks have become a common approach for evaluating the syntactic knowledge of language models (LMs). However, the creation of such benchmarks often overlooks language-specific confounders that may affect model performance, particularly in the case of morphologically rich languages. In this paper, we investigate how surface-level factors such as morpheme count, subword count, and sentence length influence the performance of LMs on a Turkish benchmark of linguistic minimal pairs. We further analyze whether a tokenizer’s degree of alignment with morphological boundaries can serve as a proxy for model performance. Finally, we test whether the distribution of morphemes in a minimal pair benchmark can skew model performance. Our results show that while surface factors have limited predictive power, they might still serve as a systematic source of bias. Moreover, we find that morphological alignment can roughly correspond to model performance, and morpheme-level imbalances in the benchmark may have a significant influence on results.
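As an illustration of what "alignment with morphological boundaries" can mean in practice, the sketch below compares the internal split points of a tokenizer's subword segmentation with those of a gold morpheme segmentation of the same word. This is a minimal sketch under stated assumptions, not necessarily the metric used in the paper: the function names `boundaries` and `boundary_alignment`, the precision/recall formulation, and the subword segmentation in the example are all hypothetical.

```python
def boundaries(segments):
    """Return the internal character offsets at which one segment ends
    and the next begins (the end of the final segment is excluded)."""
    offsets, pos = set(), 0
    for seg in segments[:-1]:
        pos += len(seg)
        offsets.add(pos)
    return offsets


def boundary_alignment(morphemes, subwords):
    """Hypothetical alignment score: precision and recall of subword
    boundaries measured against gold morpheme boundaries."""
    if "".join(morphemes) != "".join(subwords):
        raise ValueError("both segmentations must spell the same word")
    gold = boundaries(morphemes)
    pred = boundaries(subwords)
    hits = len(gold & pred)
    # With no boundaries on one side there is nothing to get wrong,
    # so the corresponding score defaults to 1.0.
    precision = hits / len(pred) if pred else 1.0
    recall = hits / len(gold) if gold else 1.0
    return precision, recall


# "evlerimizde" ("in our houses") = ev + ler + imiz + de
gold_morphemes = ["ev", "ler", "imiz", "de"]
# Assumed subword split, chosen for illustration only.
example_subwords = ["evler", "imiz", "de"]
precision, recall = boundary_alignment(gold_morphemes, example_subwords)
```

In this example every subword boundary falls on a morpheme boundary, so precision is 1.0, while the tokenizer misses the split between "ev" and "ler", so recall is 2/3. Averaging such scores over a word list would give one rough, tokenizer-level number to compare against benchmark accuracy.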

Anthology ID: 2026.sigturk-1.9
Volume: Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Kemal Oflazer, Abdullatif Köksal, Onur Varol
Venues: SIGTURK | WS
Publisher: Association for Computational Linguistics
Pages: 95–102
URL: https://preview.aclanthology.org/manual-author-scripts/2026.sigturk-1.9/
Cite (ACL): Ezgi Başar and Arianna Bisazza. 2026. A Morphology-Aware Evaluation of Turkish Syntax in Large Language Models. In Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026), pages 95–102, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): A Morphology-Aware Evaluation of Turkish Syntax in Large Language Models (Başar & Bisazza, SIGTURK 2026)
PDF: https://preview.aclanthology.org/manual-author-scripts/2026.sigturk-1.9.pdf
