Arzucan Özgür

Also published as: Arzucan Ozgur


TULAP - An Accessible and Sustainable Platform for Turkish Natural Language Processing Resources
Susan Uskudarli | Muhammet Şen | Furkan Akkurt | Merve Gürbüz | Onur Gungor | Arzucan Özgür | Tunga Güngör
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

Access to natural language processing resources is essential for their continuous improvement. This can be especially challenging in educational institutions where the software development effort required to package and release research outcomes may be overwhelming and under-recognized. Access to well-prepared and reliable research outcomes is important both for their developers and for the greater research community. This paper presents an approach to address this concern with two main goals: (1) to create an open-source, easily deployable platform where resources can be easily shared and explored, and (2) to use this platform to publish open-source Turkish NLP resources (datasets and tools) created by a research lab. The Turkish Natural Language Processing platform (TULAP) was designed and developed as an easy-to-use platform for sharing dataset and tool resources, with support for interactive tool demos. Numerous open access Turkish NLP resources have been shared on TULAP. All tools are containerized to support portability for custom use. This paper describes the design, implementation, and deployment of TULAP with use cases. The platform and a short video demonstrating the system are available online.


Identifying Hate Speech Using Neural Networks and Discourse Analysis Techniques
Zehra Melce Hüsünbeyi | Didar Akar | Arzucan Özgür
Proceedings of the First Workshop on Language Technology and Resources for a Fair, Inclusive, and Safe Society within the 13th Language Resources and Evaluation Conference

Discriminatory language, in particular hate speech, is a global problem posing a grave threat to democracy and human rights. Yet, it is not always easy to identify, as it is rarely explicit. In order to detect hate speech, we developed Hierarchical Attention Network (HAN) based and Bidirectional Encoder Representations from Transformers (BERT) based deep learning models to capture the changing discursive cues and understand the context around the discourse. In addition, we designed linguistic features using critical discourse analysis techniques and integrated them into these neural network models. We studied the compatibility of our model with the hate speech detection problem by comparing it with traditional machine learning models, as well as with a Convolutional Neural Network (CNN) based model and a Convolutional Neural Network-Gated Recurrent Unit (CNN-GRU) based model, both of which have reached significant performance results for hate speech detection. Our results on a manually annotated corpus of print media in Turkish show that the proposed approach is effective for hate speech detection. We believe that the feature sets created for the Turkish language will encourage new studies in the quantitative analysis of hate speech.

A Dataset and BERT-based Models for Targeted Sentiment Analysis on Turkish Texts
Mustafa Melih Mutlu | Arzucan Özgür
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Targeted Sentiment Analysis aims to extract sentiment towards a particular target from a given text. It is a field that is attracting attention due to the increasing accessibility of the Internet, which leads people to generate an enormous amount of data. Sentiment analysis, which in general requires annotated data for training, is a well-researched area for widely studied languages such as English. For low-resource languages such as Turkish, there is a lack of such annotated data. We present an annotated Turkish dataset suitable for targeted sentiment analysis. We also propose BERT-based models with different architectures to accomplish the task of targeted sentiment analysis. The results demonstrate that the proposed models outperform the traditional sentiment analysis models for the targeted sentiment analysis task.

Improving Code-Switching Dependency Parsing with Semi-Supervised Auxiliary Tasks
Şaziye Betül Özateş | Arzucan Özgür | Tunga Gungor | Özlem Çetinoğlu
Findings of the Association for Computational Linguistics: NAACL 2022

Code-switching dependency parsing stands as a challenging task due to both the scarcity of necessary resources and the structural difficulties embedded in code-switched languages. In this study, we introduce novel sequence labeling models to be used as auxiliary tasks for dependency parsing of code-switched text in a semi-supervised scheme. We show that using auxiliary tasks enhances the performance of an LSTM-based dependency parsing model and leads to better results compared to an XLM-R-based model with significantly less computational and time complexity. As the first study that focuses on multiple code-switching language pairs for dependency parsing, we acquire state-of-the-art scores on all of the studied languages. Our best models outperform the previous work by 7.4 LAS points on average.

A Framework for Automatic Generation of Spoken Question-Answering Data
Merve Ünlü Menevşe | Yusufcan Manav | Ebru Arisoy | Arzucan Özgür
Findings of the Association for Computational Linguistics: EMNLP 2022

This paper describes a framework to automatically generate a spoken question answering (QA) dataset. The framework consists of a question generation (QG) module to generate questions automatically from given text documents, a text-to-speech (TTS) module to convert the text documents into spoken form, and an automatic speech recognition (ASR) module to transcribe the spoken content. The final dataset contains question-answer pairs for both the reference text and ASR transcriptions as well as the audio files corresponding to each reference text. For QG and ASR systems we used pre-trained multilingual encoder-decoder transformer models and fine-tuned these models using a limited amount of manually generated QA data and TTS-based speech data, respectively. As a proof of concept, we investigated the proposed framework for Turkish and generated the Turkish Question Answering (TurQuAse) dataset using Wikipedia articles. Manual evaluation of the automatically generated question-answer pairs and QA performance evaluation with state-of-the-art models on TurQuAse show that the proposed framework is efficient for automatically generating spoken QA datasets. To the best of our knowledge, TurQuAse is the first publicly available spoken question answering dataset for Turkish. The proposed framework can be easily extended to other languages where a limited amount of QA data is available.


Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework
Talha Bedir | Karahan Şahin | Onur Gungor | Suzan Uskudarli | Arzucan Özgür | Tunga Güngör | Balkiz Ozturk Basaran
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

This paper presents several challenges faced when annotating Turkish treebanks in accordance with the Universal Dependencies (UD) guidelines and proposes solutions to address them. Most of these challenges stem from the lack of adequate support in the UD framework to accurately represent null morphemes and complex derivations, which results in a significant loss of information for Turkish. This loss negatively impacts the tools that are developed based on these treebanks. We raised and discussed these issues within the community on the official UD portal. This paper presents these issues and our proposals to more accurately represent morphosyntactic information for Turkish while adhering to the guidelines of UD. This work aims to contribute to the representation of Turkish and other agglutinative languages in UD-based treebanks, which in turn aids the development of more accurately annotated datasets for such languages.

BOUN at SemEval-2021 Task 9: Text Augmentation Techniques for Fact Verification in Tabular Data
Abdullatif Köksal | Yusuf Yüksel | Bekir Yıldırım | Arzucan Özgür
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

In this paper, we present our text augmentation based approach for the Table Statement Support Subtask (Phase A) of SemEval-2021 Task 9. We experiment with different text augmentation techniques such as back translation and synonym swapping using Word2Vec and WordNet. We show that text augmentation techniques lead to a 2.5% improvement in F1 on the test set. Further, we investigate the impact of domain adaptation and joint learning on fact verification in tabular data by utilizing the SemTabFacts and TabFact datasets. We observe that joint learning improves the F1 scores on the SemTabFacts and TabFact test sets by 3.31% and 0.77%, respectively.

Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution
Yi Huang | Buse Giledereli | Abdullatif Köksal | Arzucan Özgür | Elif Ozkirimli
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Multi-label text classification is a challenging task because it requires capturing label dependencies. It becomes even more challenging when class distribution is long-tailed. Resampling and re-weighting are common approaches used for addressing the class imbalance problem; however, they are not effective when there is label dependency besides class imbalance because they result in oversampling of common labels. Here, we introduce the application of balancing loss functions for multi-label text classification. We perform experiments on a general domain dataset with 90 labels (Reuters-21578) and a domain-specific dataset from PubMed with 18211 labels. We find that a distribution-balanced loss function, which inherently addresses both the class imbalance and label linkage problems, outperforms commonly used loss functions. Distribution balancing methods have been successfully used in the image recognition field. Here, we show their effectiveness in natural language processing. Source code is available online.


Vapur: A Search Engine to Find Related Protein - Compound Pairs in COVID-19 Literature
Abdullatif Köksal | Hilal Dönmez | Rıza Özçelik | Elif Ozkirimli | Arzucan Özgür
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

Coronavirus Disease of 2019 (COVID-19) created dire consequences globally and triggered an intense scientific effort from different domains. The resulting publications created a huge text collection in which finding the studies related to a biomolecule of interest is challenging for general purpose search engines because the publications are rich in domain specific terminology. Here, we present Vapur: an online COVID-19 search engine specifically designed to find related protein - chemical pairs. Vapur is empowered with a relation-oriented inverted index that is able to retrieve and group studies for a query biomolecule with respect to its related entities. The inverted index of Vapur is automatically created with a BioNLP pipeline and integrated with an online user interface. The online interface is designed for the smooth traversal of the current literature by domain researchers and is publicly available online.
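
As an illustration of the relation-oriented inverted index mentioned in the abstract above, the following is a minimal sketch, not Vapur's actual implementation. It assumes the BioNLP pipeline has already produced (document, entity, entity) relation triples; the function names `build_relation_index` and `query` and the placeholder identifiers in the usage note are hypothetical.

```python
from collections import defaultdict


def build_relation_index(relations):
    """Build a nested index: entity -> related entity -> set of document ids.

    `relations` is an iterable of (doc_id, entity_a, entity_b) triples,
    assumed to come from an upstream relation extraction step.
    """
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, entity_a, entity_b in relations:
        # Store the relation in both directions so that either entity
        # can be used as the query term.
        index[entity_a][entity_b].add(doc_id)
        index[entity_b][entity_a].add(doc_id)
    return index


def query(index, biomolecule):
    """Return documents for `biomolecule`, grouped by related entity."""
    related = index.get(biomolecule, {})
    return {entity: sorted(docs) for entity, docs in related.items()}
```

For example, after indexing the toy triples `("d1", "protein_A", "compound_X")`, `("d2", "protein_A", "compound_X")`, and `("d3", "protein_A", "compound_Y")`, calling `query(index, "protein_A")` groups `d1` and `d2` under `compound_X` and `d3` under `compound_Y`.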

Analyzing ELMo and DistilBERT on Socio-political News Classification
Berfu Büyüköz | Ali Hürriyetoğlu | Arzucan Özgür
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

This study evaluates the robustness of two state-of-the-art deep contextual language representations, ELMo and DistilBERT, on supervised learning of binary protest news classification (PC) and sentiment analysis (SA) of product reviews. A "cross-context" setting is enabled using test sets that are distinct from the training data. The models are fine-tuned and fed into a Feed-Forward Neural Network (FFNN) and a Bidirectional Long Short Term Memory network (BiLSTM). Multinomial Naive Bayes (MNB) and Linear Support Vector Machine (LSVM) are used as traditional baselines. The results suggest that DistilBERT can transfer generic semantic knowledge to other domains better than ELMo. DistilBERT is also 30% smaller and 83% faster than ELMo, which suggests superiority for smaller computational training budgets. When generalization is not the utmost preference and the test domain is similar to the training domain, the traditional machine learning (ML) algorithms can still be considered as more economic alternatives to deep language representations.

The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification
Abdullatif Köksal | Arzucan Özgür
Findings of the Association for Computational Linguistics: EMNLP 2020

Relation classification is one of the key topics in information extraction, which can be used to construct knowledge bases or to provide useful information for question answering. Current approaches for relation classification are mainly focused on the English language and require lots of training data with human annotations. Creating and annotating a large amount of training data for low-resource languages is impractical and expensive. To overcome this issue, we propose two cross-lingual relation classification models: a baseline model based on Multilingual BERT and a new multilingual pretraining setup, which significantly improves the baseline with distant supervision. For evaluation, we introduce a new public benchmark dataset for cross-lingual relation classification in English, French, German, Spanish, and Turkish, called RELX. We also provide the RELX-Distant dataset, which includes hundreds of thousands of sentences with relations from Wikipedia and Wikidata collected by distant supervision for these languages. Our code and data are available online.


BOUN-ISIK Participation: An Unsupervised Approach for the Named Entity Normalization and Relation Extraction of Bacteria Biotopes
İlknur Karadeniz | Ömer Faruk Tuna | Arzucan Özgür
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks

This paper presents our participation in the Bacteria Biotope Task of the BioNLP Shared Task 2019. Our participation includes two systems for the two subtasks of the Bacteria Biotope Task: the normalization of entities (BB-norm) and the identification of the relations between the entities given a biomedical text (BB-rel). For the normalization of entities, we utilized word embeddings and syntactic re-ranking. For the relation extraction task, pre-defined rules are used. Although both approaches are unsupervised, in the sense that they do not need any labeled data, they achieved promising results. In particular, for the BB-norm task, the results have shown that the proposed method performs as well as deep learning based methods, which require labeled data.

Turkish Treebanking: Unifying and Constructing Efforts
Utku Türk | Furkan Atmaca | Şaziye Betül Özateş | Abdullatif Köksal | Balkiz Ozturk Basaran | Tunga Gungor | Arzucan Özgür
Proceedings of the 13th Linguistic Annotation Workshop

In this paper, we present the current version of two different treebanks, the re-annotation of the Turkish PUD Treebank and the first annotation of the Turkish National Corpus Universal Dependency (henceforth TNC-UD). The annotation of both treebanks, the Turkish PUD Treebank and TNC-UD, was carried out based on the decisions concerning linguistic adequacy of re-annotation of the Turkish IMST-UD Treebank (Türk et al., forthcoming). Both of the treebanks were annotated with the same annotation process and morphological and syntactic analyses. The TNC-UD is planned to have 10,000 sentences. In this paper, we will present the first 500 sentences along with the annotated PUD Treebank. Moreover, this paper also offers the parsing results of a graph-based neural parser on the previous and re-annotated PUD, as well as the TNC-UD. In light of the comparisons, even though we observe a slight decrease in the attachment scores of the Turkish PUD treebank, we demonstrate that the annotation of the TNC-UD improves the parsing accuracy of Turkish. In addition to the treebanks, we have also constructed a custom annotation software with advanced filtering and morphological editing options. Both the treebanks, including a full edit-history and the annotation guidelines, and the custom software are publicly available under an open license online.

Improving the Annotations in the Turkish Universal Dependency Treebank
Utku Türk | Furkan Atmaca | Şaziye Betül Özateş | Balkız Öztürk Başaran | Tunga Güngör | Arzucan Özgür
Proceedings of the Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019)

Turkish Tweet Classification with Transformer Encoder
Atıf Emre Yüksel | Yaşar Alim Türkmen | Arzucan Özgür | Berna Altınel
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Short-text classification is a challenging task, due to the sparsity and high dimensionality of the feature space. In this study, we aim to analyze and classify Turkish tweets based on their topics. Social media jargon and the agglutinative structure of the Turkish language make this classification task even harder. As far as we know, this is the first study that uses a Transformer Encoder for short text classification in Turkish. The model is trained in a weakly supervised manner, where the training data set has been labeled automatically. Our results on the test set, which has been manually labeled, show that performing morphological analysis improves the classification performance of the traditional machine learning algorithms Random Forest, Naive Bayes, and Support Vector Machines. Still, the proposed approach achieves an F-score of 89.3%, outperforming those algorithms by at least 5 points.


A Morphology-Based Representation Model for LSTM-Based Dependency Parsing of Agglutinative Languages
Şaziye Betül Özateş | Arzucan Özgür | Tunga Güngör | Balkız Öztürk
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

We propose two word representation models for agglutinative languages that better capture the similarities between words which have similar tasks in sentences. Our models highlight the morphological features in words and embed morphological information into their dense representations. We have tested our models on an LSTM-based dependency parser with character-based word embeddings proposed by Ballesteros et al. (2015). We participated in the CoNLL 2018 Shared Task on multilingual parsing from raw text to universal dependencies as the BOUN team. We show that our morphology-based embedding models improve the parsing performance for most of the agglutinative languages.


BUSEM at SemEval-2017 Task 4A Sentiment Analysis with Word Embedding and Long Short Term Memory RNN Approaches
Deger Ayata | Murat Saraclar | Arzucan Ozgur
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our approach for SemEval-2017 Task 4: Sentiment Analysis in Twitter. We participated in Subtask A (Message Polarity Classification) and developed two systems. The first system uses word embeddings for feature representation and Support Vector Machine, Random Forest, and Naive Bayes algorithms for classification of Twitter messages into negative, neutral, and positive polarity. The second system is based on Long Short Term Memory Recurrent Neural Networks and uses word indexes as a sequence of inputs for feature representation.


Towards Building a Political Protest Database to Explain Changes in the Welfare State
Çağıl Sönmez | Arzucan Özgür | Erdem Yörük
Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

Ontology-Based Categorization of Bacteria and Habitat Entities using Information Retrieval Techniques
Mert Tiftikci | Hakan Şahin | Berfu Büyüköz | Alper Yayıkçı | Arzucan Özgür
Proceedings of the 4th BioNLP Shared Task Workshop

Named Entity Recognition on Twitter for Turkish using Semi-supervised Learning with Word Embeddings
Eda Okur | Hakan Demir | Arzucan Özgür
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Recently, due to the increasing popularity of social media, the necessity for extracting information from informal text types, such as microblog texts, has gained significant attention. In this study, we focused on the Named Entity Recognition (NER) problem on informal text types for Turkish. We utilized a semi-supervised learning approach based on neural networks. We applied a fast unsupervised method for learning continuous representations of words in vector space. We made use of these obtained word embeddings, together with language independent features that are engineered to work better on informal text types, for generating a Turkish NER system on microblog texts. We evaluated our Turkish NER system on Twitter messages and achieved better F-score performances than the published results of previously proposed NER systems on Turkish tweets. Since we did not employ any language dependent features, we believe that our method can be easily adapted to microblog texts in other morphologically rich languages.

Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization
Şaziye Betül Özateş | Arzucan Özgür | Dragomir Radev
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization. We adapt and investigate the effects of two untyped dependency tree kernels, which have originally been proposed for relation extraction, to the multi-document summarization problem. In addition, we propose a series of novel dependency grammar based kernels to better represent the syntactic and semantic similarities among the sentences. The proposed methods incorporate the type information of the dependency relations for sentence similarity calculation. To our knowledge, this is the first study that investigates using dependency tree based sentence similarity for multi-document summarization.

Segmenting Hashtags using Automatically Created Training Data
Arda Çelebi | Arzucan Özgür
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Hashtags, which are commonly composed of multiple words, are increasingly used to convey the actual messages in tweets. Understanding what tweets are saying increasingly depends on understanding hashtags. Therefore, identifying the individual words that constitute a hashtag is an important yet challenging task due to the abrupt nature of the language used in tweets. In this study, we introduce a feature-rich approach based on using supervised machine learning methods to segment hashtags. Our approach is unsupervised in the sense that instead of using manually segmented hashtags for training the machine learning classifiers, we automatically create our training data by using tweets as well as by automatically extracting hashtag segmentations from a large corpus. We achieve promising results with such automatically created noisy training data.


Expanding machine translation training data with an out-of-domain corpus using language modeling based vocabulary saturation
Burak Aydın | Arzucan Özgür
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

The training data size is of utmost importance for statistical machine translation (SMT), since it affects the training time, model size, decoding speed, as well as the system's overall success. One of the challenges for developing SMT systems for languages with fewer resources is the limited size of the available training data. In this paper, we propose an approach for expanding the training data by including parallel texts from an out-of-domain corpus. Selecting the best out-of-domain sentences for inclusion in the training set is important for the overall performance of the system. Our method is based on first ranking the out-of-domain sentences using a language modeling approach, and then adding the sentences to the training set by using the vocabulary saturation filter technique. We evaluated our approach for the English-Turkish language pair and obtained promising results. Performance improvements of up to +0.8 BLEU points for the English-Turkish translation system are achieved. We compared our results with the translation model combination approaches as well and reported the improvements. Moreover, we implemented our system with dependency parse tree based language modeling in addition to the n-gram based language modeling and reported comparable results.
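
The two-step selection described in the abstract above (rank out-of-domain sentences with a language model, then apply a vocabulary saturation filter) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: a toy add-one-smoothed unigram model stands in for the n-gram and dependency-based language models, the filter works on single words and one side of the parallel text, and the function names and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def unigram_scorer(in_domain_sentences):
    """Return a scoring function based on a toy add-one-smoothed unigram model.

    Assumption: this stands in for a real language model trained on
    in-domain data.
    """
    counts = Counter(tok for s in in_domain_sentences for tok in s.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def score(sentence):
        tokens = sentence.split()
        if not tokens:
            return float("-inf")
        log_prob = sum(
            math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens
        )
        return log_prob / len(tokens)  # length-normalised log-probability

    return score


def vocabulary_saturation_filter(ranked_sentences, threshold=1):
    """Keep a sentence only if it contains a word that has appeared in
    fewer than `threshold` of the sentences kept so far."""
    seen = Counter()
    kept = []
    for sentence in ranked_sentences:
        words = set(sentence.split())
        if any(seen[w] < threshold for w in words):
            kept.append(sentence)
            seen.update(words)
    return kept


def select_out_of_domain(in_domain, out_of_domain, threshold=1):
    """Rank out-of-domain sentences by in-domain likelihood, then filter."""
    score = unigram_scorer(in_domain)
    ranked = sorted(out_of_domain, key=score, reverse=True)
    return vocabulary_saturation_filter(ranked, threshold)
```

Ranking first means that, when several candidate sentences would contribute the same new vocabulary, the one most similar to the in-domain data is the one retained.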

A Graph-based Approach for Contextual Text Normalization
Cagil Sönmez | Arzucan Özgür
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Analyzing Stemming Approaches for Turkish Multi-Document Summarization
Muhammed Yavuz Nuzumlalı | Arzucan Özgür
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Self-training a Constituency Parser using n-gram Trees
Arda Çelebi | Arzucan Özgür
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this study, we tackle the problem of self-training a feature-rich discriminative constituency parser. We approach the self-training problem with the assumption that while the full sentence parse tree produced by a parser may contain errors, some portions of it are more likely to be correct. We hypothesize that instead of feeding the parser the guessed full sentence parse trees of its own, we can break them down into smaller ones, namely n-gram trees, and perform self-training on them. We build an n-gram parser and transfer the distinct expertise of the n-gram parser to the full sentence parser by using the Hierarchical Joint Learning (HJL) approach. The resulting jointly self-trained parser obtains a slight improvement over the baseline.


Bacteria Biotope Detection, Ontology-based Normalization, and Relation Extraction using Syntactic Rules
İlknur Karadeniz | Arzucan Özgür
Proceedings of the BioNLP Shared Task 2013 Workshop

BOUNCE: Sentiment Classification in Twitter using Rich Feature Sets
Nadin Kökciyan | Arda Çelebi | Arzucan Özgür | Suzan Üsküdarlı
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)


Citation Summarization Through Keyphrase Extraction
Vahed Qazvinian | Dragomir R. Radev | Arzucan Özgür
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)


Supervised Classification for Extracting Biomedical Events
Arzucan Özgür | Dragomir Radev
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

Detecting Speculations and their Scopes in Scientific Text
Arzucan Özgür | Dragomir R. Radev
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing


Semi-Supervised Classification for Extracting Protein Interaction Sentences using Dependency Parsing
Güneş Erkan | Arzucan Özgür | Dragomir R. Radev
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)