Wiktor Walentynowicz


2021

Enriching plWordNet with morphology
Agnieszka Dziob | Wiktor Walentynowicz
Proceedings of the 11th Global Wordnet Conference

In this paper, we present the process of adding morphological information to the Polish WordNet (plWordNet). We describe the reasons for this connection and the intuitions behind it. We also draw attention to the specifics of Polish morphology. We show in which tasks morphological information is important and how methods can be developed by extending them to include combined morphological information based on WordNet.

2019

Tagger for Polish Computer Mediated Communication Texts
Wiktor Walentynowicz | Maciej Piasecki | Marcin Oleksy
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

In this paper, we present a morpho-syntactic tagger dedicated to Computer-Mediated Communication (CMC) texts in Polish. Its construction is based on an expanded RNN-based neural network adapted to noisy texts. Among several techniques, the tagger utilises fastText embedding vectors, sequential character embedding vectors, and Brown clustering for the coarse-grained representation of sentence structures. In addition, a set of manually written rules is proposed for post-processing. The system was trained to disambiguate descriptions of words in relation to Part-of-Speech tags together with the full morphological information in terms of values for the different grammatical categories. We also present an evaluation of several model variants on gold-standard annotated CMC data, a comparison with state-of-the-art taggers for Polish, and an error analysis. The proposed tagger shows significantly better results in this domain and demonstrates the viability of adaptation.