Semantic metadata mapping in practice: the Virtual Language Observatory

Dieter Van Uytvanck, Herman Stehouwer, Lari Lampen


Abstract
In this paper we present the Virtual Language Observatory (VLO), a metadata-based portal for language resources. It is built entirely on the Component Metadata (CMDI) and ISOcat standards. This approach allows heterogeneous metadata schemas to be used while maintaining semantic compatibility. We describe the metadata harvesting process, based on OAI-PMH, and the conversion of several formats (OLAC, IMDI and the CLARIN LRT inventory) to their CMDI counterpart profiles. We then focus on post-processing steps that polish the harvested records, and describe the ingestion of the CMDI files into the VLO facet browser. We also give an overview of the changes made since the first version of the VLO, based on user feedback from the CLARIN community. Finally, we outline additional ideas and improvements for future versions of the VLO.
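The idea of keeping heterogeneous schemas semantically compatible can be illustrated with a minimal sketch. It assumes that each schema links its own elements to shared concept identifiers and that each browser facet is defined by the concepts it accepts. The schema names, element paths, concept identifiers and the function to_facets are all hypothetical, chosen only for illustration; this is not the VLO's actual configuration or code.

```python
# Hypothetical concept identifiers; real ISOcat data category references look different.
FACET_CONCEPTS = {
    "language": {"concept:language-name"},
    "country": {"concept:country"},
}

# Each (hypothetical) schema declares which shared concept its elements refer to.
SCHEMA_CONCEPT_LINKS = {
    "olac-like": {
        "dc:language": "concept:language-name",
        "dc:coverage": "concept:country",
    },
    "imdi-like": {
        "Content/Language/Name": "concept:language-name",
        "Location/Country": "concept:country",
    },
}


def to_facets(schema, record):
    """Map a flat record (element -> value) to facet values via shared concepts."""
    links = SCHEMA_CONCEPT_LINKS[schema]
    facets = {}
    for element, value in record.items():
        concept = links.get(element)  # None if the element has no concept link
        for facet, accepted in FACET_CONCEPTS.items():
            if concept in accepted:
                facets.setdefault(facet, []).append(value)
    return facets


olac_record = {"dc:language": "Dutch", "dc:title": "Example corpus"}
imdi_record = {"Content/Language/Name": "Dutch", "Location/Country": "Netherlands"}
print(to_facets("olac-like", olac_record))
print(to_facets("imdi-like", imdi_record))
```

In this sketch the two records use different element names, yet both contribute to the same "language" facet because their elements point to the same concept. Elements without a concept link, such as the title here, are simply left out of the facets.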
Anthology ID:
L12-1227
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1029–1034
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/437_Paper.pdf
Cite (ACL):
Dieter Van Uytvanck, Herman Stehouwer, and Lari Lampen. 2012. Semantic metadata mapping in practice: the Virtual Language Observatory. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1029–1034, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Semantic metadata mapping in practice: the Virtual Language Observatory (Van Uytvanck et al., LREC 2012)