2020
Implementing an End-to-End Treebank-Informed Pipeline for Bulgarian
Alexander Popov | Petya Osenova | Kiril Simov
Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories
Reconstructing NER Corpora: a Case Study on Bulgarian
Iva Marinova | Laska Laskova | Petya Osenova | Kiril Simov | Alexander Popov
Proceedings of the Twelfth Language Resources and Evaluation Conference
The paper reports on the use of deep learning methods for improving a Named Entity Recognition (NER) training corpus and for predicting and annotating new types in a test corpus. We show how the annotations in a type-based corpus of named entities (NE) were populated as occurrences within it, thus ensuring density of the training information. A deep learning model was adopted for discovering inconsistencies in the initial annotation and for learning new NE types. The evaluation results improve after data curation, randomization and deduplication.
2019
Graph Embeddings for Frame Identification
Alexander Popov | Jennifer Sikos
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Lexical resources such as WordNet (Miller, 1995) and FrameNet (Baker et al., 1998) are organized as graphs, where relationships between words are made explicit via the structure of the resource. This work explores how structural information from these lexical resources can lead to gains in a downstream task, namely frame identification. While much of the current work in frame identification uses various neural architectures to predict frames, those neural architectures only use representations of frames based on annotated corpus data. We demonstrate how incorporating knowledge directly from the FrameNet graph structure improves the performance of a neural network-based frame identification system. Specifically, we construct a bidirectional LSTM with a loss function that incorporates various graph- and corpus-based frame embeddings for learning and ultimately achieves strong performance gains with the graph-based embeddings over corpus-based embeddings alone.
Know Your Graph. State-of-the-Art Knowledge-Based WSD
Alexander Popov | Kiril Simov | Petya Osenova
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
This paper introduces several improvements over the current state of the art in knowledge-based word sense disambiguation. Those innovations are the result of modifying and enriching a knowledge base created originally on the basis of WordNet. They reflect several separate but connected strategies: manipulating the shape and the content of the knowledge base, assigning weights to the relations in the knowledge base, and adding new relations to it. The main contribution of the paper is to demonstrate that the previously proposed knowledge bases organize linguistic and world knowledge suboptimally for the task of word sense disambiguation. In doing so, the paper also establishes a new state of the art for knowledge-based approaches. Its best models are competitive in the broader context of supervised systems as well.
2018
Grammatical Role Embeddings for Enhancements of Relation Density in the Princeton WordNet
Kiril Simov | Alexander Popov | Iliana Simova | Petya Osenova
Proceedings of the 9th Global Wordnet Conference
In this paper we present an approach for training verb subatom embeddings. For each verb we learn several embeddings rather than only one. These include an embedding for the verb itself as well as embeddings for each grammatical role of this verb. To give an example, for the verb ‘to give’ we learn four embeddings: one for the lemma ‘give’, one for the subject, one for the direct object and one for the indirect object. We have exploited these grammatical role embeddings in order to add new syntagmatic relations to WordNet. The quality of the new relations has been evaluated extrinsically through the Knowledge-based Word Sense Disambiguation task.
2017
Word Sense Disambiguation with Recurrent Neural Networks
Alexander Popov
Proceedings of the Student Research Workshop Associated with RANLP 2017
This paper presents a neural network architecture for word sense disambiguation (WSD). The architecture employs recurrent neural layers, more specifically LSTM cells, in order to capture information about word order and to easily incorporate distributed word representations (embeddings) as features, without having to use a fixed window of text. The paper demonstrates that the architecture is able to compete with the most successful supervised systems for WSD and that there is an abundance of possible improvements to take it to the current state of the art. In addition, it briefly explores the potential of combining different types of embeddings as input features; it also discusses possible ways of generating “artificial corpora” from knowledge bases, both for producing training data and in relation to possible applications of embedding lemmas and word senses in the same space.
2016
The Role of the WordNet Relations in the Knowledge-based Word Sense Disambiguation Task
Kiril Simov | Alexander Popov | Petya Osenova
Proceedings of the 8th Global WordNet Conference (GWC)
In this paper we present an analysis of different semantic relations extracted from WordNet, Extended WordNet and SemCor, with respect to their role in the task of knowledge-based word sense disambiguation. The experiments use the same algorithm and the same test sets, but different variants of the knowledge graph. The results show that different sets of relations have a different impact, positive or negative. The beneficial ones are discussed with respect to the combination of relations and with respect to the test set. The inclusion of inference has only a modest impact on accuracy, while the addition of syntactic relations produces a stable improvement over the baselines.
Towards Semantic-based Hybrid Machine Translation between Bulgarian and English
Kiril Simov | Petya Osenova | Alexander Popov
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)
2015
Improving Word Sense Disambiguation with Linguistic Knowledge from a Sense Annotated Treebank
Kiril Simov | Alexander Popov | Petya Osenova
Proceedings of the International Conference Recent Advances in Natural Language Processing
Proceedings of the Student Research Workshop
Irina Temnikova | Ivelina Nikolova | Alexander Popov
Proceedings of the Student Research Workshop
2014
Joint Ensemble Model for POS Tagging and Dependency Parsing
Iliana Simova | Dimitar Vasilev | Alexander Popov | Kiril Simov | Petya Osenova
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages