Svetlozara Leseva


Linked Resources towards Enhancing the Conceptual Description of General Lexis Verbs Using Syntactic Information
Svetlozara Leseva | Ivelina Stoyanova
Proceedings of the 5th International Conference on Computational Linguistics in Bulgaria (CLIB 2022)



Semantic Analysis of Verb-Noun Derivation in Princeton WordNet
Verginica Mititelu | Svetlozara Leseva | Ivelina Stoyanova
Proceedings of the 11th Global Wordnet Conference

We present here the results of a morphosemantic analysis of the verb-noun pairs in the Princeton WordNet as reflected in the standoff file containing pairs annotated with a set of 14 semantic relations. We have automatically distinguished between zero derivation and affixal derivation in the data, identified the affixes, and manually checked the results. The data show that for each semantic relation an affix prevails in creating new words, although we cannot talk about their specificity with respect to such a relation. Moreover, certain pairs of verb-noun semantic primes are better represented for each semantic relation, and some semantic clusters (in the form of WordNet subtrees) take shape as a result. We thus employ a large-scale data-driven linguistically motivated analysis afforded by the rich derivational and morphosemantic description in WordNet to the end of capturing finer regularities in the process of derivation as represented in the semantic properties of the words involved and as reflected in the structure of the lexicon.


It Takes Two to Tango – Towards a Multilingual MWE Resource
Svetlozara Leseva | Verginica Barbu Mititelu | Ivelina Stoyanova
Proceedings of the 4th International Conference on Computational Linguistics in Bulgaria (CLIB 2020)

Mature wordnets offer the opportunity of digging out interesting linguistic information otherwise not explicitly marked in the network. The focus in this paper is on the ways the results already obtained at two levels, derivation and multiword expressions, may be further employed. The parallel recent development of the two resources under discussion, the Bulgarian and the Romanian wordnets, has enabled interlingual analyses that reveal similarities and differences between the linguistic knowledge encoded in the two wordnets. In this paper we show how the resources developed and the knowledge gained are put together towards devising a linked MWE resource that is informed by layered dictionary representation and corpus annotation and analysis. This work is a proof of concept for the adopted method of compiling a multilingual MWE resource on the basis of information extracted from the Bulgarian, the Romanian and the Princeton wordnet, as well as additional language resources and automatic procedures.

Consistency Evaluation towards Enhancing the Conceptual Representation of Verbs in WordNet
Svetlozara Leseva | Ivelina Stoyanova
Proceedings of the 4th International Conference on Computational Linguistics in Bulgaria (CLIB 2020)

This paper outlines the process of enhancing the conceptual description of verb synsets in WordNet using FrameNet frames. On the one hand, we expand the coverage of the mapping between WordNet and FrameNet, while on the other, we improve the quality of the mapping using a set of consistency checks and verification procedures. The procedures include an automatic identification of potential inconsistencies and imbalanced relations, as well as suggestions for a more precise frame assignment followed by manual validation. We perform an evaluation of the procedures in terms of the quality of the suggestions measured as the potential improvement in precision and coverage, the relevance of the result and the efficiency of the procedure.


Enhancing Conceptual Description through Resource Linking and Exploration of Semantic Relations
Ivelina Stoyanova | Svetlozara Leseva
Proceedings of the 10th Global Wordnet Conference

The paper presents current efforts towards linking two large lexical semantic resources – WordNet and FrameNet – to the end of their mutual enrichment and the facilitation of the access, extraction and analysis of various types of semantic and syntactic information. In the second part of the paper, we go on to examine the relation of inheritance and other semantic relations as represented in WordNet and FrameNet and how they correspond to each other when the resources are aligned. We discuss the implications with respect to the enhancement of the two resources through the definition of new relations and the detailisation of conceptual frames.

Hear about Verbal Multiword Expressions in the Bulgarian and the Romanian Wordnets Straight from the Horse’s Mouth
Verginica Barbu Mititelu | Ivelina Stoyanova | Svetlozara Leseva | Maria Mitrofan | Tsvetana Dimitrova | Maria Todorova
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)

In this paper we focus on verbal multiword expressions (VMWEs) in Bulgarian and Romanian as reflected in the wordnets of the two languages. The annotation of VMWEs relies on the classification defined within the PARSEME COST Action. After outlining the properties of various types of VMWEs, a cross-language comparison is drawn, aimed at highlighting the similarities and the differences between Bulgarian and Romanian with respect to the lexicalization and distribution of VMWEs. The contribution of this work is in outlining essential features of the description and classification of VMWEs and the cross-language comparison at the lexical level, which is essential for the understanding of the need for uniform annotation guidelines and a viable procedure for validation of the annotation.

Structural Approach to Enhancing WordNet with Conceptual Frame Semantics
Svetlozara Leseva | Ivelina Stoyanova
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

This paper outlines procedures for enhancing WordNet with conceptual information from FrameNet. The mapping of the two resources is non-trivial. We define a number of techniques for the validation of the consistency of the mapping and the extension of its coverage which make use of the structure of both resources and the systematic relations between synsets in WordNet and between frames in FrameNet, as well as between synsets and frames. We present a case study on causativity, a relation which provides enhancement complementary to the one using hierarchical relations, by means of linking in a systematic way large parts of the lexicon. We show how consistency checks and denser relations may be implemented on the basis of this relation. We then propose new frames based on causative-inchoative correspondences and in conclusion touch on the possibilities for defining new frames based on the types of specialisation that take place from parent to child synset.


Automatic Prediction of Morphosemantic Relations
Svetla Koeva | Svetlozara Leseva | Ivelina Stoyanova | Tsvetana Dimitrova | Maria Todorova
Proceedings of the 8th Global WordNet Conference (GWC)

This paper presents a machine learning method for automatic identification and classification of morphosemantic relations (MSRs) between verb and noun synset pairs in the Bulgarian WordNet (BulNet). The core training data comprise 6,641 morphosemantically related verb–noun literal pairs from BulNet. The core dataset was preprocessed to improve its quality by applying validation and reorganisation procedures. Further, the data were supplemented with negative examples of literal pairs not linked by an MSR. The designed supervised machine learning method uses the RandomTree algorithm and is implemented in Java with the Weka package. A set of experiments was performed to test various approaches to the task. Future work on improving the classifier includes adding more training data, employing more features, and fine-tuning. Apart from the language specific information about derivational processes, the proposed method is language independent.


Automatic Classification of WordNet Morphosemantic Relations
Svetlozara Leseva | Ivelina Stoyanova | Maria Todorova | Tsvetana Dimitrova | Borislav Rizov | Svetla Koeva
The 5th Workshop on Balto-Slavic Natural Language Processing


Wordnet-Based Cross-Language Identification of Semantic Relations
Ivelina Stoyanova | Svetla Koeva | Svetlozara Leseva
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing

Text Modification for Bulgarian Sign Language Users
Slavina Lozanova | Ivelina Stoyanova | Svetlozara Leseva | Svetla Koeva | Boian Savtchev
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations


Application of Clause Alignment for Statistical Machine Translation
Svetla Koeva | Svetlozara Leseva | Ivelina Stoyanova | Rositsa Dekova | Angel Genov | Borislav Rizov | Tsvetana Dimitrova | Ekaterina Tarpomanova | Hristina Kukova
Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation


Chooser: a Multi-Task Annotation Tool
Svetla Koeva | Borislav Rizov | Svetlozara Leseva
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The paper presents a tool assisting manual annotation of linguistic data developed at the Department of Computational Linguistics, IBL-BAS. Chooser is a general-purpose modular application for corpus annotation based on the principles of commonality and reusability of the created resources, language and theory independence, extendibility and user-friendliness. These features have been achieved through a powerful abstract architecture within the Model-View-Controller paradigm that is easily tailored to task-specific requirements and readily extendable to new applications. The tool is to a considerable extent independent of data format and representation and produces outputs that are largely consistent with existing standards. The annotated data are therefore reusable in tasks requiring different levels of annotation and are accessible to external applications. The tool incorporates edit functions, pass and arrangement strategies that facilitate annotators’ work. The relevant module produces tree-structured and graph-based representations in respective annotation modes. Another valuable feature of the application is concurrent access by multiple users and centralised storage of lexical resources underlying annotation schemata, as well as of annotations, including frequency of selection, updates in the lexical database, etc. Chooser has been successfully applied to a number of tasks: POS tagging, WS and syntactic annotation.