Özge Bakay
2026
Frequency modulates structural choice in Turkish suspended affixation: a latent-process account
Utku Turk | Eva Neu | Özge Bakay | Brian Dillon | Gaja Jarosz
Proceedings of the Society for Computation in Linguistics 2026
Suspended affixation (SA) allows a suffix on one conjunct to scope over all coordinated elements. While inflectional SA is productive in Turkish, derivational SA is claimed to be highly restricted; yet speakers readily accept certain cases. We propose that this gradient acceptability reflects a frequency-modulated choice between two possible syntactic representations: base-generation, which licenses derivational SA, and ellipsis. To test this, we conducted a rating task on the acceptability of four derivational suffixes in SA form while manipulating the frequency of coordinations. Using a Multinomial Processing Tree model to isolate latent structural choices from surface ratings, we found that frequency modulated SA acceptability for some suffixes (i.e., sIz ‘-less’ and cI ‘-maker’), but not others (i.e., lI ‘-having’ and lIk ‘-for’). These findings suggest that frequency shapes syntactic parsing in morphologically complex environments.
From Dependency to CCG to Incremental CCG: Approaches to Flexible Word Order in Turkish
Özge Bakay | Oğuz Kerem Yıldız | Rajesh Bhatt | Brian Dillon | Olcay Taner Yildiz
Proceedings of the 30th Conference on Computational Natural Language Learning
Combinatory Categorial Grammar (CCG), a lexicalized formalism known for its flexible constituency, is well-suited for modeling head-final languages with flexible word order like Turkish. Building on Kuzgun et al. (2023), we first develop a Turkish CCG lexicon by automatically inducing categories from a dependency treebank. By leveraging standard and extended operations tailored to Turkish syntax, our parser achieves a robust coverage of 92.5%. Furthermore, we introduce the first (partially) incremental, left-to-right CCG parser for Turkish, designed to facilitate the immediate integration of words into the evolving representation. Finally, we present an example experiment showing that CCG parsers can model psycholinguistic evidence for extra processing costs associated with arguments in noncanonical positions, via the frequency of order-reversing operations. These findings provide evidence that CCG offers a cognitively plausible framework for modeling real-time processing in languages like Turkish.
2022
Time Travel in Turkish: WordNets for Modern Turkish
Ceren Oksal | Hikmet N. Oguz | Mert Catal | Nurkay Erbay | Ozgecan Yuzer | Ipek B. Unsal | Oguzhan Kuyrukcu | Arife B. Yenice | Aslı Kuzgun | Büşra Marşan | Ezgi Sanıyar | Bilge Arican | Merve Dogan | Özge Bakay | Olcay Taner Yıldız
Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference
Wordnets have been popular tools for providing and representing semantic and lexical relations of languages. They are useful tools for various purposes in NLP studies. Many researchers have created WordNets for different languages. For Turkish, there are two WordNets, namely the Turkish WordNet of BalkaNet and KeNet. In this paper, we present new WordNets for Turkish, each of which is based on one of the first 9 editions of the Turkish dictionary, starting from the 1944 edition. These WordNets are historical in nature and have implications for Modern Turkish. They are developed by extending KeNet, which was created based on the 2005 and 2011 editions of the Turkish dictionary. In this paper, we explain the steps in creating these 9 new WordNets for Turkish, discuss the challenges in the process, and report comparative results about the WordNets.
2021
Turkish WordNet KeNet
Özge Bakay | Özlem Ergelen | Elif Sarmış | Selin Yıldırım | Bilge Nas Arıcan | Atilla Kocabalcıoğlu | Merve Özçelik | Ezgi Sanıyar | Oğuzhan Kuyrukçu | Begüm Avar | Olcay Taner Yıldız
Proceedings of the 11th Global Wordnet Conference
Currently, there are two available wordnets for Turkish: TR-wordnet of BalkaNet and KeNet. As the more comprehensive wordnet for Turkish, KeNet includes 76,757 synsets. KeNet has intralingual semantic relations and is also linked to PWN through interlingual relations. In this paper, we present the procedure adopted in creating KeNet, give details about our approach in annotating semantic relations such as hypernymy, and discuss the language-specific problems encountered in these processes.
HisNet: A Polarity Lexicon based on WordNet for Emotion Analysis
Merve Özçelik | Bilge Nas Arıcan | Özge Bakay | Elif Sarmış | Özlem Ergelen | Nilgün Güler Bayezit | Olcay Taner Yıldız
Proceedings of the 11th Global Wordnet Conference
Dictionary-based methods in sentiment analysis have received scholarly attention recently, the most comprehensive examples of which can be found in English. However, many other languages lack polarity dictionaries, or the existing ones are small in size, as in the case of SentiTurkNet, the first and only polarity dictionary in Turkish. Thus, this study aims to extend the content of SentiTurkNet by comparing the two available WordNets in Turkish, namely KeNet and TR-wordnet of BalkaNet. To this end, a current Turkish polarity dictionary has been created relying on 76,825 synsets matching KeNet, where each synset has been annotated with one of three polarity labels: positive, negative, or neutral. Meanwhile, the comparison of KeNet and TR-wordnet of BalkaNet has revealed their weaknesses, such as the repetition of the same senses, the lack of necessary merges of items belonging to the same synset, and the presence of redundant narrower versions of synsets, which are discussed in light of their potential for improving the current lexical databases of Turkish.
2020
TRopBank: Turkish PropBank V2.0
Neslihan Kara | Deniz Baran Aslan | Büşra Marşan | Özge Bakay | Koray Ak | Olcay Taner Yıldız
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper, we present and explain TRopBank, “Turkish PropBank v2.0”. PropBank is a hand-annotated corpus of propositions which is used to obtain the predicate-argument information of a language. Predicate-argument information of a language can help understand semantic roles of arguments. “Turkish PropBank v2.0”, unlike PropBank v1.0, has a much more extensive list of Turkish verbs, with 17,673 verbs in total.
2019
Comparing Sense Categorization Between English PropBank and English WordNet
Özge Bakay | Begüm Avar | Olcay Taner Yıldız
Proceedings of the 10th Global Wordnet Conference
Given that verbs play a crucial role in language comprehension, this paper presents a study which compares the verb senses in English PropBank with the ones in English WordNet through manual tagging. After analyzing 1554 senses in 1453 distinct verbs, we found that while the majority of the senses in PropBank have one-to-one correspondents in WordNet, a substantial number of them are differentiated. Furthermore, by analyzing the differences between our manually-tagged resource and an automatically-tagged resource, we claim that manual tagging can help provide better results in sense annotation.
English-Turkish Parallel Semantic Annotation of Penn-Treebank
Bilge Nas Arıcan | Özge Bakay | Begüm Avar | Olcay Taner Yıldız | Özlem Ergelen
Proceedings of the 10th Global Wordnet Conference
This paper reports our efforts in constructing a sense-labeled English-Turkish parallel corpus using the traditional method of manual tagging. We tagged a pre-built parallel treebank which was translated from the Penn Treebank corpus. This approach allowed us to generate a resource combining syntactic and semantic information. We provide statistics about the corpus itself as well as information regarding its development process.
Co-authors
- Olcay Taner Yıldız 7
- Bilge Nas Arıcan 3
- Begüm Avar 3
- Özlem Ergelen 3
- Brian W. Dillon 2
- Oğuzhan Kuyrukçu 2
- Büşra Marşan 2
- Ezgi Sanıyar 2
- Elif Sarmış 2
- Merve Özçelik 2
- Koray Ak 1
- Bilge Arican 1
- Deniz Baran Aslan 1
- Nilgün Güler Bayezit 1
- Rajesh Bhatt 1
- Mert Catal 1
- Merve Doğan 1
- Nurkay Erbay 1
- Gaja Jarosz 1
- Neslihan Kara 1
- Atilla Kocabalcıoğlu 1
- Aslı Kuzgun 1
- Eva Neu 1
- Hikmet N. Oguz 1
- Ceren Oksal 1
- Utku Türk 1
- Ipek B. Unsal 1
- Arife B. Yenice 1
- Ozgecan Yuzer 1
- Selin Yıldırım 1
- Oğuz Kerem Yıldız 1