Burcu Can

2021

pdf bib abs
Bilingual Terminology Extraction Using Neural Word Embeddings on Comparable Corpora
Darya Filippova | Burcu Can | Gloria Corpas Pastor
Proceedings of the Student Research Workshop Associated with RANLP 2021

Term and glossary management are vital steps of preparation of every language specialist, and they play a very important role at the stage of education of translation professionals. The growing trend of efficient time management and constant time constraints we may observe in every job sector increases the necessity of the automatic glossary compilation. Many well-performing bilingual AET systems are based on processing parallel data, however, such parallel corpora are not always available for a specific domain or a language pair. Domain-specific, bilingual access to information and its retrieval based on comparable corpora is a very promising area of research that requires a detailed analysis of both available data sources and the possible extraction techniques. This work focuses on domain-specific automatic terminology extraction from comparable corpora for the English – Russian language pair by utilizing neural word embeddings.

2020

pdf bib abs
Self Attended Stack-Pointer Networks for Learning Long Term Dependencies
Salih Tuc | Burcu Can
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

We propose a novel deep neural architecture for dependency parsing, which is built upon a Transformer Encoder (Vaswani et al. 2017) and a Stack Pointer Network (Ma et al. 2018). We first encode each sentence using a Transformer Network and then the dependency graph is generated by a Stack Pointer Network by selecting the head of each word in the sentence through a head selection process. We evaluate our model on Turkish and English treebanks. The results show that our trasformer-based model learns long term dependencies efficiently compared to sequential models such as recurrent neural networks. Our self attended stack pointer network improves UAS score around 6% upon the LSTM based stack pointer (Ma et al. 2018) for Turkish sentences with a length of more than 20 words.

2019

2018

pdf bib abs
Characters or Morphemes: How to Represent Words?
Ahmet Üstün | Murathan Kurfalı | Burcu Can
Proceedings of The Third Workshop on Representation Learning for NLP

In this paper, we investigate the effects of using subword information in representation learning. We argue that using syntactic subword units effects the quality of the word representations positively. We introduce a morpheme-based model and compare it against to word-based, character-based, and character n-gram level models. Our model takes a list of candidate segmentations of a word and learns the representation of the word based on different segmentations that are weighted by an attention mechanism. We performed experiments on Turkish as a morphologically rich language and English with a comparably poorer morphology. The results show that morpheme-based models are better at learning word representations of morphologically complex languages compared to character-based and character n-gram level models since the morphemes help to incorporate more syntactic knowledge in learning, that makes morpheme-based models better at syntactic tasks.

pdf bib abs
Tree Structured Dirichlet Processes for Hierarchical Morphological Segmentation
Burcu Can | Suresh Manandhar
Computational Linguistics, Volume 44, Issue 2 - June 2018

This article presents a probabilistic hierarchical clustering model for morphological segmentation. In contrast to existing approaches to morphology learning, our method allows learning hierarchical organization of word morphology as a collection of tree structured paradigms. The model is fully unsupervised and based on the hierarchical Dirichlet process. Tree hierarchies are learned along with the corresponding morphological paradigms simultaneously. Our model is evaluated on Morpho Challenge and shows competitive performance when compared to state-of-the-art unsupervised morphological segmentation systems. Although we apply this model for morphological segmentation, the model itself can also be used for hierarchical clustering of other types of data.

Burcu Can

2021

2020

2019

2018

2013

2012

Co-authors

Venues