Utku Türk
Also published as: Utku Turk
2026
Frequency modulates structural choice in Turkish suspended affixation: a latent-process account
Utku Turk | Eva Neu | Özge Bakay | Brian Dillon | Gaja Jarosz
Proceedings of the Society for Computation in Linguistics 2026
Suspended affixation (SA) allows a suffix on one conjunct to scope over all coordinated elements. While inflectional SA is productive in Turkish, derivational SA is claimed to be highly restricted; yet speakers readily accept certain cases. We propose that this gradient acceptability reflects a frequency-modulated choice between two possible syntactic representations: base-generation, which licenses derivational SA, and ellipsis. To test this, we conducted a rating task on the acceptability of four derivational suffixes in SA form while manipulating the frequency of coordinations. Using a Multinomial Processing Tree model to isolate latent structural choices from surface ratings, we found that frequency modulated SA acceptability for some suffixes (i.e., sIz ‘-less’ and cI ‘-maker’), but not others (i.e., lI ‘-having’ and lIk ‘-for’). These findings suggest that frequency shapes syntactic parsing in morphologically complex environments.
Quantifying the cross-linguistic effects of syncretism on agreement attraction
Utku Turk | Eva Neu
Proceedings of the Society for Computation in Linguistics 2026
Agreement attraction errors, in which a verb erroneously agrees with an intervening noun rather than its grammatical head, are amplified by morphological syncretism in some languages (English, German, Russian) but not others (Turkish, Armenian), a cross-linguistic pattern without a principled account. We use surprisal and attention entropy from large language models as processing proxies to investigate this variation across four languages. LLM-derived measures replicate behavioral findings in English and German (syncretism modulates attraction), align with Turkish null results (no modulation), and partially capture Russian patterns. We discuss further directions for better understanding why syncretism affects agreement attraction differently across languages.
2020
First Steps towards Universal Dependencies for Laz
Utku Türk | Kaan Bayar | Ayşegül Dilara Özercan | Görkem Yiğit Öztürk | Şaziye Betül Özateş
Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)
This paper presents the first treebank for the Laz language, which is also the first Universal Dependencies Treebank for a South Caucasian language. This treebank aims to create a syntactically and morphologically annotated resource for further research. We also aim to document an endangered language in a systematic fashion within an inherently cross-linguistic framework: the Universal Dependencies Project (UD). As of now, our treebank consists of 576 sentences and 2,306 tokens annotated in line with the UD guidelines. We evaluated the treebank on the dependency parsing task using a pretrained multilingual parsing model, and the results are comparable with other low-resource treebanks with no training set. We aim to expand our treebank in the near future to include 1,500 sentences. The bigger goal for our project is to create a set of treebanks for minority languages in Anatolia.
2019
Improving the Annotations in the Turkish Universal Dependency Treebank
Utku Türk | Furkan Atmaca | Şaziye Betül Özateş | Balkız Öztürk Başaran | Tunga Güngör | Arzucan Özgür
Proceedings of the Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019)
Turkish Treebanking: Unifying and Constructing Efforts
Utku Türk | Furkan Atmaca | Şaziye Betül Özateş | Abdullatif Köksal | Balkiz Ozturk Basaran | Tunga Gungor | Arzucan Özgür
Proceedings of the 13th Linguistic Annotation Workshop
In this paper, we present the current version of two different treebanks, the re-annotation of the Turkish PUD Treebank and the first annotation of the Turkish National Corpus Universal Dependency (henceforth TNC-UD). The annotation of both treebanks, the Turkish PUD Treebank and TNC-UD, was carried out based on the decisions concerning linguistic adequacy of re-annotation of the Turkish IMST-UD Treebank (Türk et al., forthcoming). Both of the treebanks were annotated with the same annotation process and morphological and syntactic analyses. The TNC-UD is planned to have 10,000 sentences. In this paper, we will present the first 500 sentences along with the annotation of the PUD Treebank. Moreover, this paper also offers the parsing results of a graph-based neural parser on the previous and re-annotated PUD, as well as the TNC-UD. In light of the comparisons, even though we observe a slight decrease in the attachment scores of the Turkish PUD treebank, we demonstrate that the annotation of the TNC-UD improves the parsing accuracy of Turkish. In addition to the treebanks, we have also constructed a custom annotation software with advanced filtering and morphological editing options. Both the treebanks, including a full edit-history and the annotation guidelines, and the custom software are publicly available under an open license online.