Paola Velardi

Also published as: P. Velardi

2022

A Large Interlinked Knowledge Graph of the Italian Cultural Heritage
Stefano Faralli | Andrea Lenzi | Paola Velardi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Knowledge is the lifeblood for a plethora of applications such as search, recommender systems and natural language understanding. Thanks to the efforts in the fields of Semantic Web and Linked Open Data a growing number of interlinked knowledge bases are supporting the development of advanced knowledge-based applications. Unfortunately, for a large number of domain-specific applications, these knowledge bases are unavailable. In this paper, we present a resource consisting of a large knowledge graph linking the Italian cultural heritage entities (defined in the ArCo ontology) with the concepts defined on well-known knowledge bases (i.e., DBpedia and the Getty GVP ontology). We describe the methodologies adopted for the semi-automatic resource creation and provide an in-depth analysis of the resulting interlinked graph.

2020

Multiple Knowledge GraphDB (MKGDB)
Stefano Faralli | Paola Velardi | Farid Yusifli
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present MKGDB, a large-scale graph database created as a combination of multiple taxonomy backbones extracted from 5 existing knowledge graphs, namely: ConceptNet, DBpedia, WebIsAGraph, WordNet and the Wikipedia category hierarchy. MKGDB, thanks to the versatility of the Neo4j graph database manager technology, is intended to favour and help the development of open-domain natural language processing applications relying on knowledge bases, such as information extraction, hypernymy discovery, topic clustering, and others. Our resource consists of a large hypernymy graph which counts more than 37 million nodes and more than 81 million hypernymy relations.
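
To make the notion of a hypernymy graph concrete, the following is a minimal in-memory sketch, not MKGDB itself and not a Neo4j query. The edge list `EDGES` and the helper names `build_hypernym_index` and `ancestors` are invented for illustration; the real resource stores tens of millions of such relations in a graph database.

```python
from collections import deque

# Toy (hyponym, hypernym) pairs, invented for illustration only.
EDGES = [
    ("dog", "canine"),
    ("canine", "mammal"),
    ("cat", "feline"),
    ("feline", "mammal"),
    ("mammal", "animal"),
]


def build_hypernym_index(edges):
    """Map each term to the set of its direct hypernyms."""
    index = {}
    for hyponym, hypernym in edges:
        index.setdefault(hyponym, set()).add(hypernym)
    return index


def ancestors(index, term):
    """Return every direct or indirect hypernym of `term` (breadth-first)."""
    seen = set()
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in index.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    # Guard against cyclic input: a term is not its own hypernym.
    seen.discard(term)
    return seen


index = build_hypernym_index(EDGES)
print(sorted(ancestors(index, "dog")))
```

A breadth-first walk with a visited set is used so that the traversal terminates even if merged taxonomies introduce cycles, which can plausibly happen when several sources are combined.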

2018

A Large Multilingual and Multi-domain Dataset for Recommender Systems
Giorgia Di Tommaso | Stefano Faralli | Paola Velardi
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Hashtag Sense Clustering Based on Temporal Similarity
Giovanni Stilo | Paola Velardi
Computational Linguistics, Volume 43, Issue 1 - April 2017

Hashtags are creative labels used in micro-blogs to characterize the topic of a message/discussion. Regardless of the use for which they were originally intended, hashtags cannot be used as a means to cluster messages with similar content. First, because hashtags are created in a spontaneous and highly dynamic way by users in multiple languages, the same topic can be associated with different hashtags, and conversely, the same hashtag may refer to different topics in different time periods. Second, contrary to common words, hashtag disambiguation is complicated by the fact that no sense catalogs (e.g., Wikipedia or WordNet) are available; and, furthermore, hashtag labels are difficult to analyze, as they often consist of acronyms, concatenated words, and so forth. A common way to determine the meaning of hashtags has been to analyze their context, but, as we have just pointed out, hashtags can have multiple and variable meanings. In this article, we propose a temporal sense clustering algorithm based on the idea that semantically related hashtags have similar and synchronous usage patterns.
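
The idea that related hashtags have similar and synchronous usage patterns can be illustrated with a small sketch. This is not the article's algorithm, whose details the abstract does not give; it assumes Pearson correlation over aligned count series as the similarity measure and a simple threshold-based grouping. The function names, the threshold value, and the counts are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long, time-aligned count series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must be aligned and have at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no temporal signal
    return cov / sqrt(var_x * var_y)


def group_by_usage(series, threshold=0.9):
    """Group hashtags whose series correlate at or above `threshold`.

    Groups are the connected components of the thresholded similarity
    graph, computed with a small union-find structure.
    """
    tags = sorted(series)
    parent = {tag: tag for tag in tags}

    def find(tag):
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]  # path halving
            tag = parent[tag]
        return tag

    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            if pearson(series[a], series[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for tag in tags:
        groups.setdefault(find(tag), []).append(tag)
    return sorted(groups.values())


# Invented counts per time slot, for illustration only.
counts = {
    "#worldcup": [1, 2, 10, 30, 5, 1],
    "#wc2014": [0, 3, 12, 28, 6, 2],
    "#oscars": [20, 5, 1, 0, 2, 25],
}
print(group_by_usage(counts))
```

Because the series are compared slot by slot, two hashtags are grouped only when they peak at the same time, which is one simple reading of "synchronous" usage.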

What to Write? A topic recommender for journalists
Alessandro Cucchiarelli | Christian Morbidoni | Giovanni Stilo | Paola Velardi
Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism

In this paper we present a recommender system, What To Write and Why, capable of suggesting to a journalist, for a given event, the aspects still uncovered in news articles on which the readers focus their interest. The basic idea is to characterize an event according to the echo it receives in online news sources and associate it with the corresponding readers’ communicative and informative patterns, detected through the analysis of Twitter and Wikipedia, respectively. Our methodology temporally aligns the results of this analysis and recommends the concepts that emerge as topics of interest from Twitter and Wikipedia, either not covered or poorly covered in the published news articles.

2013

OntoLearn Reloaded: A Graph-Based Algorithm for Taxonomy Induction
Paola Velardi | Stefano Faralli | Roberto Navigli
Computational Linguistics, Volume 39, Issue 3 - September 2013

Automated learning of everyday patients’ language for medical blogs analytics
Giovanni Stilo | Moreno De Vincenzi | Alberto E. Tozzi | Paola Velardi
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

2012

A New Method for Evaluating Automatically Learned Terminological Taxonomies
Paola Velardi | Roberto Navigli | Stefano Faralli | Juana Maria Ruiz Martinez
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Evaluating a taxonomy learned automatically against an existing gold standard is a very complex problem, because differences stem from the number, label, depth and ordering of the taxonomy nodes. In this paper we propose casting the problem as one of comparing two hierarchical clusters. To this end we defined a variation of the Fowlkes and Mallows measure (Fowlkes and Mallows, 1983). Our method assigns a similarity value B^i_(l,r) to the learned (l) and reference (r) taxonomy for each cut i of the corresponding anonymised hierarchies, starting from the topmost nodes down to the leaf concepts. For each cut i, the two hierarchies can be seen as two clusterings C^i_l , C^i_r of the leaf concepts. We assign a prize to early similarity values, i.e. when concepts are clustered in a similar way down to the lowest taxonomy levels (close to the leaf nodes). We apply our method to the evaluation of the taxonomy learning methods put forward by Navigli et al. (2011) and Kozareva and Hovy (2010).
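
For reference, the standard Fowlkes and Mallows index that the abstract builds on can be sketched as below. This is the classical pair-counting measure between two flat clusterings, not the paper's variation: the weighting of early cuts is not reproduced here. The function name and the example labels are assumptions made for illustration; each label list stands for the clustering of the same leaf concepts induced by one cut of the learned or reference hierarchy.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_l, labels_r):
    """Standard Fowlkes-Mallows index between two clusterings of the same items.

    labels_l[i] and labels_r[i] are the cluster ids of item i in the
    learned and reference clustering, respectively.
    """
    if len(labels_l) != len(labels_r):
        raise ValueError("both clusterings must label the same items")
    both = only_l = only_r = 0
    for i, j in combinations(range(len(labels_l)), 2):
        same_l = labels_l[i] == labels_l[j]
        same_r = labels_r[i] == labels_r[j]
        if same_l and same_r:
            both += 1      # pair grouped together in both clusterings
        elif same_l:
            only_l += 1    # grouped together only in the learned one
        elif same_r:
            only_r += 1    # grouped together only in the reference one
    if both == 0:
        return 0.0
    return both / sqrt((both + only_l) * (both + only_r))


# Hypothetical cut over four leaf concepts.
learned_cut = [0, 0, 0, 1]
reference_cut = [0, 0, 1, 1]
print(fowlkes_mallows(learned_cut, reference_cut))
```

Computing one such value per cut i, from the topmost nodes down to the leaves, would give the sequence of similarity values the abstract refers to; how those values are then combined is specific to the paper.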

2010

Learning Word-Class Lattices for Definition and Hypernym Extraction
Roberto Navigli | Paola Velardi
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

An Annotated Dataset for Extracting Definitions and Hypernyms from the Web
Roberto Navigli | Paola Velardi | Juana Maria Ruiz-Martínez
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper presents and analyzes an annotated corpus of definitions, created to train an algorithm for the automatic extraction of definitions and hypernyms from web documents. As an additional resource, we also include a corpus of non-definitions with syntactic patterns similar to those of definition sentences, e.g.: "An android is a robot" vs. "Snowcap is unmistakable". Domain and style independence is obtained thanks to the annotation of a large and domain-balanced corpus and to a novel pattern generalization algorithm based on word-class lattices (WCL). A lattice is a directed acyclic graph (DAG), a subclass of nondeterministic finite state automata (NFA). The lattice structure has the purpose of preserving the salient differences among distinct sequences, while eliminating redundant information. The WCL algorithm will be integrated into an improved version of the GlossExtractor Web application (Velardi et al., 2008). This paper is mostly concerned with a description of the corpus, the annotation strategy, and a linguistic analysis of the data. A summary of the WCL algorithm is also provided for the sake of completeness.
