Mikus Grasmanis


2022

Towards Latvian WordNet
Peteris Paikens | Mikus Grasmanis | Agute Klints | Ilze Lokmane | Lauma Pretkalniņa | Laura Rituma | Madara Stāde | Laine Strankale
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper we describe our current work on creating a WordNet for Latvian based on the principles of the Princeton WordNet. The chosen methodology for word sense definition and sense linking relies on corpus evidence and the existing Tezaurs.lv online dictionary, ensuring a foundation that fits Latvian language usage and the existing linguistic tradition. We cover a wide set of semantic relations, including gradation sets. Currently the dataset consists of 6432 words linked in 5528 synsets, of which 2717 synsets are considered fully completed: they have all outgoing semantic links annotated, corpus examples for each sense, and links to the English Princeton WordNet.