Olivera Kitanović
The paper presents research on the preparation of parallel corpora, focusing on their transformation into RDF graphs that use the NLP Interchange Format (NIF) for linguistic annotation. We give an overview of the parallel corpus used in this case study and of the processes of POS tagging, lemmatization, named entity recognition (NER), and named entity linking (NEL), the last of which is implemented using Wikidata. In the first phase of NEL, the main characters and places mentioned in the novels are stored in Wikidata; in the second phase, they are linked with the occurrences of previously annotated entities in the text. Next, we describe the data conversion to RDF and the incorporation of NIF annotations. The resulting NIF files were evaluated by exploring the triplestore with SPARQL queries. Finally, we discuss the bridging of Linked Data and Digital Humanities research, as well as some drawbacks related to the verbosity of the transformation. In the context of linked data and parallel corpora, semantic interoperability ensures that data exchanged between systems carries shared, well-defined meanings, enabling effective communication and understanding.
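To illustrate the kind of output such a conversion produces, the following is a minimal sketch, not the authors' pipeline. It emits N-Triples for a NIF context and for entity mentions linked to Wikidata, using only the Python standard library. The function names, the example.org document URI, and the sample sentence are assumptions made for illustration. The NIF and ITS RDF property names (nif:isString, nif:beginIndex, nif:endIndex, nif:anchorOf, nif:referenceContext, itsrdf:taIdentRef) come from the published vocabularies.

```python
# Namespaces of the NIF core ontology and the ITS 2.0 RDF vocabulary.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_NNI = "http://www.w3.org/2001/XMLSchema#nonNegativeInteger"


def literal(value):
    """Escape a string as a plain N-Triples literal."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'


def nif_triples(doc_uri, text, entities):
    """Return N-Triples lines for one NIF context and its entity mentions.

    entities is a list of (surface_form, target_uri) pairs. Only the first
    occurrence of each surface form is annotated (a simplification).
    """
    context = f"{doc_uri}#char=0,{len(text)}"
    lines = [
        f"<{context}> <{RDF_TYPE}> <{NIF}Context> .",
        f"<{context}> <{NIF}isString> {literal(text)} .",
    ]
    for surface, target in entities:
        begin = text.find(surface)
        if begin < 0:
            continue  # surface form not present in this text
        end = begin + len(surface)
        mention = f"{doc_uri}#char={begin},{end}"
        lines += [
            f"<{mention}> <{RDF_TYPE}> <{NIF}Phrase> .",
            f"<{mention}> <{NIF}referenceContext> <{context}> .",
            f'<{mention}> <{NIF}beginIndex> "{begin}"^^<{XSD_NNI}> .',
            f'<{mention}> <{NIF}endIndex> "{end}"^^<{XSD_NNI}> .',
            f"<{mention}> <{NIF}anchorOf> {literal(surface)} .",
            f"<{mention}> <{ITSRDF}taIdentRef> <{target}> .",
        ]
    return lines


# Hypothetical sentence; Q84 is the Wikidata item for London.
sample = "Phileas Fogg lived in London."
for line in nif_triples("http://example.org/doc1", sample,
                        [("London", "http://www.wikidata.org/entity/Q84")]):
    print(line)
```

Even this one-sentence example yields eight triples for a single linked mention, which gives a sense of the verbosity the abstract mentions.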
This paper presents the results of integrating lexical and terminological resources, most of them developed within the Human Language Technology (HLT) Group at the University of Belgrade, with the Geological Information System of Serbia (GeolISS), developed at the Faculty of Mining and Geology and funded by the Ministry of Environmental Protection. We first describe the approach to GeolISS development, which aims to integrate existing geologic archives, data from published maps at different scales, and newly acquired field data, and to support intranet and internet publishing of geologic information. We then describe the multilingual geologic vocabulary and the other lexical and terminological resources used. Two basic results are outlined: multilingual map annotation and improved querying of the GeolISS geodatabase. Multilingual labelling and annotation of maps for graphic display and printing have been tested with Serbian, which describes regional information in the local language, and English, which is used for sharing geographic information with the world, although the geological vocabulary allows other languages to be integrated as well. The resources also enable semantic and morphological expansion of queries, the latter being very important in highly inflected languages such as Serbian.
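The idea of query expansion can be sketched as follows. This is a toy illustration, not the GeolISS implementation: the function name, the two dictionaries, and their contents are assumptions. The inflection table lists only a few forms of the Serbian noun krečnjak (limestone), and the related-terms table stands in for a multilingual vocabulary.

```python
# Hypothetical mini-lexicon: lemma -> some inflected forms (not exhaustive).
INFLECTIONS = {
    "krečnjak": ["krečnjaka", "krečnjaku", "krečnjakom", "krečnjaci", "krečnjake"],
    "limestone": ["limestones"],
}

# Hypothetical vocabulary: term -> related terms (here, an English equivalent).
RELATED = {
    "krečnjak": ["limestone"],
}


def expand_query(term, inflections, related):
    """Return all strings a search for `term` should match.

    Semantic expansion adds related terms first; morphological expansion
    then adds the known inflected forms of every resulting lemma.
    """
    lemmas = {term} | set(related.get(term, ()))
    forms = set()
    for lemma in lemmas:
        forms.add(lemma)
        forms.update(inflections.get(lemma, ()))
    return sorted(forms)


print(expand_query("krečnjak", INFLECTIONS, RELATED))
```

Without the morphological step, a search for the nominative form krečnjak would miss records that contain only the genitive krečnjaka or the plural krečnjaci, which is why this kind of expansion matters for a language like Serbian.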