Angelica Lo Duca


2014

Accommodations in Tuscany as Linked Data
Clara Bacciu | Angelica Lo Duca | Andrea Marchetti | Maurizio Tesconi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The OpeNER Linked Dataset (OLD) contains 19,140 entries about accommodations in Tuscany (Italy). For each accommodation, it describes the type (e.g. hotel, bed and breakfast, hostel, campsite) and other useful information, such as a short description, the Web address, its location and the features it provides. OLD is the linked data version of the open dataset provided by Fondazione Sistema Toscana, the representative system for tourism in Tuscany. In addition to the original dataset, OLD also links each accommodation to the most common social media (Facebook, Foursquare, Google Places and Booking). OLD exploits three common ontologies of the accommodation domain: Acco, Hontology and GoodRelations. The idea is to provide a flexible dataset that can be expressed in more than one ontology. OLD is available as a SPARQL endpoint and is released under a Creative Commons license. Finally, OLD is developed within the OpeNER European project, which aims at building a set of ready-to-use tools to recognize and disambiguate entity mentions and perform sentiment analysis and opinion detection on texts. Within the project, OLD provides a named entity repository for entity disambiguation.
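
As a rough illustration of what describing one accommodation with several ontologies at once could look like, the following minimal sketch types a single entry with one class per vocabulary and serializes the result as N-Triples. It is not taken from the paper: the `example.org` namespaces, the class names in `TYPE_MAP`, the subject URI pattern and the use of FOAF `homepage` for the Web address are all assumptions made for the example.

```python
# Sketch only: every namespace and class name below is a placeholder,
# not the real Acco, Hontology or GoodRelations vocabulary.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"  # assumed property choice

NAMESPACES = {
    "acco": "http://example.org/acco#",
    "hontology": "http://example.org/hontology#",
    "gr": "http://example.org/goodrelations#",
}

# Hypothetical mapping from a source accommodation type to one class per vocabulary.
TYPE_MAP = {
    "hotel": {"acco": "Hotel", "hontology": "Hotel", "gr": "Lodging"},
    "hostel": {"acco": "Hostel", "hontology": "Hostel", "gr": "Lodging"},
}


def accommodation_triples(entry_id, kind, homepage):
    """Return (subject, predicate, object) URI triples for one accommodation."""
    subject = f"http://example.org/old/accommodation/{entry_id}"
    # One rdf:type statement per vocabulary, so clients of any of the three
    # ontologies can find the same entity.
    triples = [
        (subject, RDF_TYPE, NAMESPACES[prefix] + cls)
        for prefix, cls in TYPE_MAP[kind].items()
    ]
    triples.append((subject, FOAF_HOMEPAGE, homepage))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(accommodation_triples("42", "hotel", "http://example.org/hotel-42")))
```

Typing the same subject once per vocabulary keeps the data flat: a consumer that knows only one of the ontologies can still select the entity by its `rdf:type`.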

Sharing Cultural Heritage: the Clavius on the Web Project
Matteo Abrate | Angelo Mario Del Grosso | Emiliano Giovannetti | Angelica Lo Duca | Damiana Luzzi | Lorenzo Mancini | Andrea Marchetti | Irene Pedretti | Silvia Piccini
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In the last few years the number of manuscripts digitized and made available on the Web has been constantly increasing. However, there is still a considerable lack of results concerning both making their content explicit and the tools developed to make it available. The objective of the Clavius on the Web project is to develop a Web platform exposing a selection of Christophorus Clavius's letters along with three different levels of analysis: linguistic, lexical and semantic. The multilayered annotation of the corpus involves an XML-TEI encoding followed by a tokenization step in which each token is uniquely identified through a CTS URN and then associated with a part of speech and a lemma. The text is lexically and semantically annotated on the basis of a lexicon and a domain ontology, the former structuring the most relevant terms occurring in the text and the latter representing the domain entities of interest (e.g. people, places, etc.). Moreover, each entity is connected to linked and non-linked resources, including DBpedia and VIAF. Finally, the results of the three layers of analysis are gathered and shown through interactive visualization and storytelling techniques. A demo version of the integrated architecture was developed.
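
The tokenization step described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the base URN `urn:cts:example:clavius.letter1.ed1`, the function name `tokenize_with_urns` and the simple regex tokenizer are hypothetical. The `pos` and `lemma` fields are left empty for a later tagging step. The subreference form `@word[n]`, where `n` is the occurrence of that word within the passage, follows the usual CTS URN convention.

```python
import re
from collections import Counter


def tokenize_with_urns(base_urn, passage, text):
    """Split a passage into word tokens and give each one a unique CTS-style URN.

    base_urn and passage are illustrative inputs; pos and lemma are placeholders
    to be filled by a separate part-of-speech tagger and lemmatizer.
    """
    seen = Counter()
    tokens = []
    for word in re.findall(r"\w+", text):
        # Repeated words are told apart by their occurrence index: de[1], de[2], ...
        seen[word] += 1
        urn = f"{base_urn}:{passage}@{word}[{seen[word]}]"
        tokens.append({"urn": urn, "form": word, "pos": None, "lemma": None})
    return tokens


if __name__ == "__main__":
    for token in tokenize_with_urns(
        "urn:cts:example:clavius.letter1.ed1", "1", "de sphaera et de numeris"
    ):
        print(token["urn"])
```

Because the occurrence index is counted per word form within one passage, two tokens share a URN only if they have the same passage, form and index, which cannot happen in a single call.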