Duško Vitas


2014

Enriching Serbian WordNet and Electronic Dictionaries with Terms from the Culinary Domain
Staša Vujičić Stanković | Cvetana Krstev | Duško Vitas
Proceedings of the Seventh Global Wordnet Conference

2011

A tagged and aligned corpus for the study of Proper Names in translation
Emeline Lecuit | Denis Maurel | Duško Vitas
Proceedings of The Second Workshop on Annotation and Exploitation of Parallel Corpora

E-Dictionaries and Finite-State Automata for the Recognition of Named Entities
Cvetana Krstev | Duško Vitas | Ivan Obradović | Miloš Utvić
Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing

2010

A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
Cvetana Krstev | Ranka Stanković | Duško Vitas
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of support for some types of categories. At the same time, various descriptions often describe exactly the same morphological property using different approaches. We propose a new morphological description for Serbian following the feature structure representation defined by the ISO standard. In this description we try to incorporate all characteristics of Serbian that need to be specified for various applications. We have developed several XSLT scripts that transform our description into descriptions needed for various applications. We have developed the first version of this new description, but we treat it as an ongoing project because for some properties we have not yet found a satisfactory solution.

2009

E-Connecting Balkan Languages
Cvetana Krstev | Ranka Stanković | Duško Vitas | Svetla Koeva
Proceedings of the Workshop Multilingual resources, technologies and evaluation for central and Eastern European languages

2008

The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
Cvetana Krstev | Ranka Stanković | Duško Vitas | Ivan Obradović
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which is of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic and morphological expansion of the query, the latter being very important in highly inflective languages, such as Serbian. Wordnets can also be used for adding another language to a query, if appropriate, thus making the query bilingual. Problems encountered in retrieving documents of interest are discussed and illustrated by examples. A brief description of resources is given, followed by an outline of the web tool which enables their integration. Finally, a set of examples is chosen in order to illustrate the use of the lexical resources and tool in question. Results obtained for these examples show that the number of documents obtained through a query by using our approach can double and even quadruple in some cases.

2006

WS4LR: A Workstation for Lexical Resources
Cvetana Krstev | Ranka Stanković | Duško Vitas | Ivan Obradović
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we describe WS4LR, the workstation for lexical resources, a software tool developed within the Human Language Technology Group at the Faculty of Mathematics, University of Belgrade. The tool is aimed at manipulating heterogeneous lexical resources, and the need for such a tool came from the large volume of resources the Group has developed in the course of many years and within different projects. The tool handles morphological dictionaries, wordnets, aligned texts and transducers equally and has already proved very useful for various tasks. Although it has so far been used mainly for Serbian, WS4LR is not language dependent and can be successfully used for resources in other languages provided that they follow the described formats and methodologies. The tool operates on the .NET platform and runs on a personal computer under the Windows 2000/XP/2003 operating systems with at least 256MB of internal memory.

2004

Combining Heterogeneous Lexical Resources
Cvetana Krstev | Duško Vitas | Ranka Stanković | Ivan Obradović | Gordana Pavlović-Lažetić
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

The MULTEXT-East Morphosyntactic Specification for Slavic Languages
Tomaž Erjavec | Cvetana Krstev | Vladimír Petkevič | Kiril Simov | Marko Tadić | Duško Vitas
Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages

Composite Tense Recognition and Tagging in Serbian
Duško Vitas | Cvetana Krstev
Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages