Baiba Saulīte

Also published as: Baiba Saulite


2020

Deriving a PropBank Corpus from Parallel FrameNet and UD Corpora
Normunds Gruzitis | Roberts Darģis | Laura Rituma | Gunta Nešpore-Bērzkalne | Baiba Saulite
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

We propose an approach for generating an accurate and consistent PropBank-annotated corpus, given a FrameNet-annotated corpus which has an underlying dependency annotation layer, namely, a parallel Universal Dependencies (UD) treebank. The PropBank annotation layer of such a multi-layer corpus can be semi-automatically derived from the existing FrameNet and UD annotation layers, by providing a mapping configuration from lexical units in [a non-English language] FrameNet to [English language] PropBank predicates, and a mapping configuration from FrameNet frame elements to PropBank semantic arguments for the given pair of a FrameNet frame and a PropBank predicate. The latter mapping generally depends on the underlying UD syntactic relations. To demonstrate our approach, we use Latvian FrameNet, annotated on top of Latvian UD Treebank, for generating Latvian PropBank in compliance with the Universal Propositions approach.
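The two mapping configurations described in the abstract can be pictured with a minimal Python sketch. Every table entry, function name, and data layout below is an illustrative assumption, not the authors' configuration format or implementation. The example uses the FrameNet frame Giving, the English PropBank roleset give.01, and UD relations such as nsubj and obj. A frame element with no mapping is left as None, to mirror the semi-automatic nature of the derivation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping: (FrameNet frame, lexical unit) -> English PropBank roleset.
LU_TO_ROLESET: Dict[Tuple[str, str], str] = {
    ("Giving", "dot.v"): "give.01",  # illustrative entry only
}

# Hypothetical mapping: (frame, roleset, frame element, UD relation) -> PropBank argument.
# A UD relation of None acts as a fallback that matches any relation.
FE_TO_ARG: Dict[Tuple[str, str, str, Optional[str]], str] = {
    ("Giving", "give.01", "Donor", "nsubj"): "ARG0",
    ("Giving", "give.01", "Theme", "obj"): "ARG1",
    ("Giving", "give.01", "Recipient", None): "ARG2",
}


def map_argument(frame: str, roleset: str, fe: str, deprel: str) -> Optional[str]:
    """Look up a PropBank argument, preferring a UD-relation-specific rule."""
    exact = FE_TO_ARG.get((frame, roleset, fe, deprel))
    if exact is not None:
        return exact
    # Fall back to a rule that ignores the UD relation, if one exists.
    return FE_TO_ARG.get((frame, roleset, fe, None))


def derive_propbank(
    frame: str, lexical_unit: str, elements: List[Tuple[str, str]]
) -> Optional[Tuple[str, Dict[str, Optional[str]]]]:
    """Derive (roleset, {frame element: PropBank argument}) for one annotated frame.

    `elements` is a list of (frame element, UD relation) pairs.
    Returns None when the lexical unit has no roleset mapping.
    """
    roleset = LU_TO_ROLESET.get((frame, lexical_unit))
    if roleset is None:
        return None
    # Unmapped frame elements stay None so they can be reviewed manually.
    args = {fe: map_argument(frame, roleset, fe, deprel) for fe, deprel in elements}
    return roleset, args


if __name__ == "__main__":
    annotated = [("Donor", "nsubj"), ("Theme", "obj"), ("Recipient", "iobj")]
    print(derive_propbank("Giving", "dot.v", annotated))
```

Keying the second table on the UD relation as well as the frame element reflects the abstract's remark that the frame-element mapping generally depends on the underlying syntactic relations.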

2018

Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU
Normunds Gruzitis | Lauma Pretkalnina | Baiba Saulite | Laura Rituma | Gunta Nespore-Berzkalne | Arturs Znotins | Peteris Paikens
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Tēzaurs.lv: the Largest Open Lexical Database for Latvian
Andrejs Spektors | Ilze Auzina | Roberts Dargis | Normunds Gruzitis | Peteris Paikens | Lauma Pretkalnina | Laura Rituma | Baiba Saulite
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe an extensive and versatile lexical resource for Latvian, an under-resourced Indo-European language, which we call Tezaurs (Latvian for ‘thesaurus’). It comprises a large explanatory dictionary of more than 250,000 entries that are derived from more than 280 external sources. The dictionary is enriched with phonetic, morphological, semantic and other annotations, as well as augmented by various language processing tools allowing for the generation of inflectional forms and pronunciation, for on-the-fly selection of corpus examples, for suggesting synonyms, etc. Tezaurs is available as a public and widely used web application for end-users, as an open data set for use in language technology (LT), and as an API, that is, a set of web services for integration into third-party applications. The ultimate goal of Tezaurs is to be the central computational lexicon for Latvian, bringing together all Latvian words and frequently used multi-word units and allowing for the integration of other LT resources and tools.

2011

A Prague Markup Language profile for the SemTi-Kamols grammar model
Lauma Pretkalniņa | Gunta Nešpore | Kristīne Levāne-Petrova | Baiba Saulīte
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)

2007

Dependency-Based Hybrid Model of Syntactic Analysis for the Languages with a Rather Free Word Order
Guntis Bārzdiņš | Normunds Grūzītis | Gunta Nešpore | Baiba Saulīte
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)