2020
Identification of Indigenous Knowledge Concepts through Semantic Networks, Spelling Tools and Word Embeddings
Renato Rocha Souza | Amelie Dorn | Barbara Piringer | Eveline Wandl-Vogt
Proceedings of the Twelfth Language Resources and Evaluation Conference
To access indigenous, regional knowledge contained in language corpora, semantic tools and network methods are most typically employed. In this paper we present an approach for identifying dialectal variants of words, or words that do not belong to High German, using the example of the non-standard language legacy collection questionnaires of the Bavarian Dialects in Austria (DBÖ). Based on selected cultural categories relevant to the wider project context, common words from each category were identified and lemmatized with GermaLemma. The semantic vicinity of each word was then explored through word embedding models, followed by lookups in the German wordnet (GermaNet) and the Hunspell tool. Whilst none of these tools has comprehensive coverage of standard German words, together they serve as an indication of dialectal words in specific semantic hierarchies. The methods and tools applied in this study may serve as an example for similar projects dealing with non-standard or endangered language collections that aim to access, analyze and ultimately preserve native regional language heritage.
2017
Proceedings of the Workshop Knowledge Resources for the Socio-Economic Sciences and Humanities associated with RANLP 2017
Kalliopi Zervanou | Petya Osenova | Eveline Wandl-Vogt | Dan Cristea
Proceedings of the Workshop Knowledge Resources for the Socio-Economic Sciences and Humanities associated with RANLP 2017
2014
How to semantically relate dialectal Dictionaries in the Linked Data Framework
Thierry Declerck | Eveline Wandl-Vogt
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)
A SKOS-based Schema for TEI encoded Dictionaries at ICLTT
Thierry Declerck | Karlheinz Mörth | Eveline Wandl-Vogt
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
At our institutes we work with a considerable number of dictionaries and lexical resources for less-resourced language data, such as dialects and historical languages. We aim to publish these lexical data in the Linked Open Data framework in order to link them with available data sets for highly-resourced languages, thus elevating them to the same digital dignity that mainstream languages have gained. In this paper we concentrate on two TEI-encoded dictionaries of varieties of Arabic and propose a mapping of this TEI-encoded data onto SKOS, showing how the lexical entries of the two dialectal dictionaries can be linked to other language resources available in the Linked Open Data cloud.