Use Case: Romanian Language Resources in the LOD Paradigm
Verginica Barbu Mititelu, Elena Irimia, Vasile Pais, Andrei-Marius Avram, Maria Mitrofan
Abstract
In this paper, we report on (i) the conversion of Romanian language resources to the Linked Open Data specifications and requirements, (ii) their publication, and (iii) their interlinking with other language resources (for Romanian or for other languages). The pool of converted resources is made up of the Romanian Wordnet; the morphosyntactic and phonemic lexicon RoLEX; four treebanks, one for the general language (the Romanian Reference Treebank) and three others for specialised domains (SiMoNERo for medicine, LegalNERo for the legal domain, PARSEME-Ro for verbal multiword expressions); frequency information on lemmas and tokens and word embeddings, as extracted from the reference corpus for contemporary Romanian (CoRoLa); and a bi-modal (text and speech) corpus. We also present the limitations arising from the representation of the resources in Linked Data format. The metadata of the LOD resources have been published in the LOD Cloud. The resources are available for download on our website, and a SPARQL endpoint is also available for querying them.
- Anthology ID:
- 2022.ldl-1.5
- Volume:
- Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Thierry Declerck, John P. McCrae, Elena Montiel, Christian Chiarcos, Maxim Ionov
- Venue:
- LDL
- Publisher:
- European Language Resources Association
- Pages:
- 35–44
- URL:
- https://preview.aclanthology.org/ingest_wac_2008/2022.ldl-1.5/
- Cite (ACL):
- Verginica Barbu Mititelu, Elena Irimia, Vasile Pais, Andrei-Marius Avram, and Maria Mitrofan. 2022. Use Case: Romanian Language Resources in the LOD Paradigm. In Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference, pages 35–44, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Use Case: Romanian Language Resources in the LOD Paradigm (Barbu Mititelu et al., LDL 2022)
- PDF:
- https://preview.aclanthology.org/ingest_wac_2008/2022.ldl-1.5.pdf
- Data
- LegalNERo, RTASC