Büşra Marşan


2021

Building the Turkish FrameNet
Büşra Marşan | Neslihan Kara | Merve Özçelik | Bilge Nas Arıcan | Neslihan Cesur | Aslı Kuzgun | Ezgi Sanıyar | Oğuzhan Kuyrukçu | Olcay Taner Yıldız
Proceedings of the 11th Global Wordnet Conference

FrameNet (Lowe, 1997; Baker et al., 1998; Fillmore and Atkins, 1998; Johnson et al., 2001) is a computational lexicography project that aims to offer insight into the semantic relationships between predicates and their arguments. Used in many NLP applications, FrameNet has proven to be a valuable resource. The main goal of this study is to lay the foundation for a comprehensive and cohesive Turkish FrameNet that is compatible with other Turkish-language resources such as PropBank (Kara et al., 2020) and WordNet (Bakay et al., 2019; Ehsani, 2018; Ehsani et al., 2018; Parlar et al., 2019; Bakay et al., 2020).

FrameForm: An Open-source Annotation Interface for FrameNet
Büşra Marşan | Olcay Taner Yıldız
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

In this paper, we introduce FrameForm, an open-source annotation tool designed to accommodate predicate annotations based on Frame Semantics. FrameForm is a user-friendly tool for creating, annotating, and maintaining computational lexicography projects like FrameNet, and it was used to build the Turkish FrameNet. Because it is responsive and open-source, FrameForm can be easily modified to meet the annotation needs of a wide range of languages.

From Constituency to UD-Style Dependency: Building the First Conversion Tool of Turkish
Aslı Kuzgun | Oğuz Kerem Yıldız | Neslihan Cesur | Büşra Marşan | Arife Betül Yenice | Ezgi Sanıyar | Oğuzhan Kuyrukçu | Bilge Nas Arıcan | Olcay Taner Yıldız
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

This paper describes the process of building the first constituency-to-dependency conversion tool for Turkish. The starting point of this work is a previous study in which 10,000 phrase structure trees from the original Penn Treebank corpus were manually translated into Turkish. Within the scope of this project, these Turkish phrase structure trees were automatically converted into UD-style dependency structures using both a rule-based algorithm and a machine learning algorithm, each tailored to the requirements of the Turkish language. The output was revised by a team of linguists, and the refined versions were taken as gold-standard annotations for evaluating the algorithms. In this comparison, the machine learning approach proved more accurate than the rule-based algorithm. In addition to contributing a large dataset of 10,000 Turkish dependency trees to the UD Project, this project fills an important gap by providing a Turkish conversion tool, which enables the quick compilation of dependency corpora that can be used to train better dependency parsers.

2020

TRopBank: Turkish PropBank V2.0
Neslihan Kara | Deniz Baran Aslan | Büşra Marşan | Özge Bakay | Koray Ak | Olcay Taner Yıldız
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper, we present TRopBank, “Turkish PropBank v2.0”. PropBank is a hand-annotated corpus of propositions that is used to obtain the predicate-argument information of a language. This information helps in understanding the semantic roles of arguments. Unlike PropBank v1.0, “Turkish PropBank v2.0” has a much more extensive list of Turkish verbs, with 17,673 verbs in total.