Elżbieta Hajnicz

Also published as: Elzbieta Hajnicz


2022

Annotation of metaphorical expressions in the Basic Corpus of Polish Metaphors
Elżbieta Hajnicz
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper presents a corpus of Polish texts annotated with metaphorical expressions. It is composed of two parts of comparable size, selected from two subcorpora of the Polish National Corpus: the subcorpus manually annotated at the morphosyntactic level, the named-entity level, etc., and the Polish Coreference Corpus, with manually annotated mentions and the coreference relations between them, but automatically annotated at the morphosyntactic level (only the second part is actually annotated). In the paper we briefly outline the method for identifying metaphorical expressions in a text, based on the MIPVU procedure. The main differences are the stress put on novel metaphors and the consideration of neologistic derivatives that have metaphorical properties. The annotation procedure is based on two notions: the vehicle, a part of an expression used metaphorically and representing a source domain, and its topic, a part referring to reality and representing a target domain. Next, we propose several features (text form, conceptual structure, conventionality and contextuality) to classify metaphorical expressions identified in texts. Additionally, some metaphorical expressions are identified as concerning personal identity matters and classified with respect to their properties. Finally, we analyse and evaluate the results of the annotation.
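The vehicle/topic distinction and the four classification features named in the abstract can be pictured as a simple annotation record. The following is only an illustrative sketch: the class names, the token-offset representation, and the feature values ("nominal", "novel", and so on) are assumptions made for this example and are not taken from the corpus format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    """Token offsets into a sentence; `end` is exclusive (hypothetical layout)."""
    start: int
    end: int


@dataclass(frozen=True)
class MetaphorAnnotation:
    """One metaphorical expression: a vehicle, an optional topic, four features."""
    vehicle: Span                # part used metaphorically (source domain)
    topic: Optional[Span]        # part referring to reality (target domain)
    text_form: str               # feature values below are placeholders
    conceptual_structure: str
    conventionality: str
    contextuality: str


def span_text(tokens: List[str], span: Span) -> str:
    """Return the surface text covered by a span."""
    return " ".join(tokens[span.start:span.end])


# English toy sentence used purely for illustration.
tokens = ["time", "is", "money"]
annotation = MetaphorAnnotation(
    vehicle=Span(2, 3),
    topic=Span(0, 1),
    text_form="nominal",
    conceptual_structure="simple",
    conventionality="conventional",
    contextuality="context-independent",
)
print(span_text(tokens, annotation.vehicle), "<-", span_text(tokens, annotation.topic))
```

The topic is optional in this sketch on the assumption that a vehicle may appear without an explicit topic in the text; the abstract itself does not say so.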

2020

Interannotator Agreement for Lexico-Semantic Annotation of a Corpus
Elżbieta Hajnicz
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper examines the procedure for lexico-semantic annotation of the Basic Corpus of Polish Metaphors, which is the first step towards annotating the metaphoric expressions occurring in it. The procedure involves correcting the morphosyntactic annotation of the part of the corpus that is automatically annotated at the morphosyntactic level. The main procedure concerns the annotation of adjectives, adverbs, nouns and verbs (including gerunds and participles), as well as abbreviations of words that belong to these classes. It is composed of three steps: deciding whether a particular occurrence of a word is asemantic (e.g. anaphoric or strictly grammatical); deciding whether we are dealing with a multi-word expression; and reciprocal usages of the się marker and pluralia tantum, which may involve annotation with two lexical units (having two different lemmas) for a single token. We propose an interannotator agreement statistic adequate for this procedure. Finally, we discuss the preliminary results of annotating a fragment of the corpus.

2019

Connections between the semantic layer of Walenty valency dictionary and PlWordNet
Elzbieta Hajnicz | Tomasz Bartosiak
Proceedings of the 10th Global Wordnet Conference

In this paper we discuss how Walenty uses PlWordNet to represent semantic information. We decided to use PlWordNet lexical units and synsets to describe both the predicate meaning and the semantic fields of its arguments. The original design decision required some further refinement because of the structure of PlWordNet and the complex relations between arguments.

2018

A New Version of the Składnica Treebank of Polish Harmonised with the Walenty Valency Dictionary
Marcin Woliński | Elżbieta Hajnicz | Tomasz Bartosiak
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Accessing and Elaborating Walenty - a Valence Dictionary of Polish - via Internet Browser
Bartłomiej Nitoń | Tomasz Bartosiak | Elżbieta Hajnicz
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This article presents Walenty, a new valence dictionary of Polish predicates, concentrating on its creation process and on access via an Internet browser. The dictionary contains two layers, syntactic and semantic. The syntactic layer describes the syntactic and morphosyntactic constraints that predicates put on their dependants. The semantic layer shows how predicates and their arguments are involved in a situation described in an utterance. These two layers are connected, representing how semantic arguments can be realised on the surface. Walenty also contains a powerful phraseological (idiomatic) component. Walenty has been created with, and can be accessed remotely through, a dedicated tool called Slowal. In this article, we focus on the most important functionalities of this system. First, we show how to access the dictionary and how the built-in filtering system (covering both syntactic and semantic phenomena) works. Then, we describe the process of creating the dictionary with the Slowal tool, which both supports and controls the work of lexicographers.

Semantic Layer of the Valence Dictionary of Polish Walenty
Elżbieta Hajnicz | Anna Andrzejczuk | Tomasz Bartosiak
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This article presents the semantic layer of Walenty, a new valence dictionary of Polish predicates with a number of novel features compared to other such dictionaries. The dictionary contains two layers, syntactic and semantic. The syntactic layer describes the syntactic and morphosyntactic constraints that predicates put on their dependants. In particular, it includes a comprehensive and powerful phraseological component. The semantic layer shows how predicates and their arguments are involved in a situation described in an utterance. These two layers are connected, representing how semantic arguments can be realised on the surface. Each syntactic schema and each semantic frame is illustrated by at least one example sentence attested in linguistic reality. The semantic layer consists of semantic frames represented as lists of pairs and connected with PlWordNet lexical units. Semantic roles have a two-level representation (basic roles are provided with an attribute), enabling a flexible representation of arguments. Selectional preferences are based on the PlWordNet structure as well.

2014

Walenty: Towards a comprehensive valence dictionary of Polish
Adam Przepiórkowski | Elżbieta Hajnicz | Agnieszka Patejuk | Marcin Woliński | Filip Skwarski | Marek Świdziński
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents Walenty, a comprehensive valence dictionary of Polish with a number of novel features compared to other such dictionaries. The notion of argument is based on the coordination test and takes into consideration the possibility of diverse morphosyntactic realisations. Some aspects of the internal structure of phraseological (idiomatic) arguments are handled explicitly. While the current version of the dictionary concentrates on syntax, it already contains some semantic features, including semantically defined arguments, such as locative, temporal or manner, as well as control and raising; work on extending it with semantic roles and selectional preferences is in progress. Although Walenty is still being intensively developed, it is already by far the largest Polish valence dictionary, with around 8600 verbal lemmata and almost 39 000 valence schemata. The dictionary is publicly available under the Creative Commons BY SA licence and may be downloaded from http://zil.ipipan.waw.pl/Walenty.

The Procedure of Lexico-Semantic Annotation of Składnica Treebank
Elżbieta Hajnicz
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, the procedure of lexico-semantic annotation of the Składnica Treebank using Polish WordNet is presented. Other semantically annotated corpora, in particular treebanks, are outlined first. The resources involved in the annotation, as well as the tool used for it, called Semantikon, are described. The main part of the paper is an analysis of the applied procedure, which consists of a basic phase and a correction phase. During the basic phase, all nouns, verbs and adjectives are annotated with wordnet senses. The annotation is performed independently by two linguists. During the correction phase, conflicts are resolved by the linguist supervising the process. Multi-word units obtain special tags, and synonyms and hypernyms are used for senses absent from Polish WordNet. Additionally, each sentence receives a general assessment. Finally, some statistics on the results of the annotation are given, including inter-annotator agreement. The final resource is represented in XML files preserving the structure of Składnica.
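As an illustration of how agreement between two independent annotators can be quantified, the sketch below computes Cohen's kappa over parallel lists of sense labels. This is a generic, widely used measure chosen for the example; the abstract does not state which agreement statistic the paper reports, and the sense labels shown are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical sense labels assigned by two linguists to the same four tokens.
linguist_1 = ["zamek.1", "zamek.2", "zamek.1", "zamek.2"]
linguist_2 = ["zamek.1", "zamek.2", "zamek.1", "zamek.1"]
print(cohen_kappa(linguist_1, linguist_2))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when a few frequent senses dominate the label distribution.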

Lexico-Semantic Annotation of Składnica Treebank by means of PLWN Lexical Units
Elżbieta Hajnicz
Proceedings of the Seventh Global Wordnet Conference

Extended phraseological information in a valence dictionary for NLP applications
Adam Przepiórkowski | Elżbieta Hajnicz | Agnieszka Patejuk | Marcin Woliński
Proceedings of Workshop on Lexical and Grammatical Resources for Language Processing