Database of Latvian Morphemes and Derivational Models: ideas and expected results

Andra Kalnača, Tatjana Pakalne, Kristīne Levāne-Petrova


Abstract
In this paper, we describe “The Database of Latvian Morphemes and Derivational Models”, a large-scale, corpus-based, and manually validated database of Latvian derivational morphology currently in development at the University of Latvia. At the morpheme level, the database contains morphemes, including morpheme variants (allomorphs), morpheme types, morpheme homonymy/homography resolution, hierarchical relations between root morphemes, and links to word families. At the lemma level, it contains the base form, morphemic segmentation, POS, grammatical features, derivational motivation (including compounding), and word-family membership. The focus of the database is on providing linguistically accurate, comprehensive data as a reliable basis for future work in different fields.
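To make the two levels of data named in the abstract more concrete, the following is a minimal sketch of how a single record might be modelled. It is not the database's actual schema: the class names, field names, and type labels are all hypothetical, and the sample word rakstnieks ("writer") with its segmentation is an illustrative assumption rather than an entry taken from the database.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Morpheme:
    """Morpheme-level record (hypothetical field names)."""
    form: str                            # canonical written form
    mtype: str                           # e.g. "root", "prefix", "suffix", "ending"
    allomorphs: tuple[str, ...] = ()     # morpheme variants
    homonym_index: int = 1               # tells homonymous/homographic morphemes apart
    parent_root: Optional[str] = None    # hierarchical relation between roots


@dataclass
class Lemma:
    """Lemma-level record (hypothetical field names)."""
    base_form: str
    segmentation: list[Morpheme]
    pos: str
    features: dict[str, str] = field(default_factory=dict)
    # Motivating word(s); a compound would list more than one.
    motivated_by: list[str] = field(default_factory=list)
    word_family: Optional[str] = None

    def segmented(self, sep: str = "-") -> str:
        """Join the morpheme forms into a readable segmentation string."""
        return sep.join(m.form for m in self.segmentation)


# Illustrative entry only; the analysis is an assumption, not database content.
lemma = Lemma(
    base_form="rakstnieks",
    segmentation=[
        Morpheme("rakst", "root"),
        Morpheme("niek", "suffix"),
        Morpheme("s", "ending"),
    ],
    pos="noun",
    features={"gender": "masculine"},
    motivated_by=["rakstīt"],
    word_family="rakst",
)

print(lemma.segmented())
```

Keeping morphemes as separate records, instead of storing only a segmented string per lemma, is one way to let allomorphs, homonym resolution, and root hierarchies be stated once and shared across all lemmas in a word family.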
Anthology ID:
2025.nodalida-1.29
Volume:
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Richard Johansson, Sara Stymne
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
279–286
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.nodalida-1.29/
Cite (ACL):
Andra Kalnača, Tatjana Pakalne, and Kristīne Levāne-Petrova. 2025. Database of Latvian Morphemes and Derivational Models: ideas and expected results. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 279–286, Tallinn, Estonia. University of Tartu Library.
Cite (Informal):
Database of Latvian Morphemes and Derivational Models: ideas and expected results (Kalnača et al., NoDaLiDa 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.nodalida-1.29.pdf