Towards Latvian WordNet
Peteris Paikens, Mikus Grasmanis, Agute Klints, Ilze Lokmane, Lauma Pretkalniņa, Laura Rituma, Madara Stāde, Laine Strankale
Abstract
In this paper we describe our current work on creating a WordNet for Latvian based on the principles of the Princeton WordNet. The chosen methodology for word sense definition and sense linking is based on corpus evidence and the existing Tezaurs.lv online dictionary, ensuring a foundation that fits Latvian language usage and the existing linguistic tradition. We cover a wide set of semantic relations, including gradation sets. Currently the dataset consists of 6432 words linked in 5528 synsets, of which 2717 synsets are considered fully completed: they have all outgoing semantic links annotated, corpus examples for each sense, and links to the English Princeton WordNet.
- Anthology ID:
- 2022.lrec-1.300
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 2808–2815
- URL:
- https://aclanthology.org/2022.lrec-1.300
- Cite (ACL):
- Peteris Paikens, Mikus Grasmanis, Agute Klints, Ilze Lokmane, Lauma Pretkalniņa, Laura Rituma, Madara Stāde, and Laine Strankale. 2022. Towards Latvian WordNet. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2808–2815, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Towards Latvian WordNet (Paikens et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.lrec-1.300.pdf