A Meta-data Driven Platform for Semi-automatic Configuration of Ontology Mediators
Manuel Fiorelli
Maria Teresa Pazienza
Armando Stellato
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Ontology mediators often demand extensive configuration, or even the adaptation of the input ontologies to remedy unsupported modeling patterns. In this paper we propose MAPLE (MAPping Architecture based on Linguistic Evidences), an architecture and software platform that semi-automatically solves this configuration problem by reasoning on metadata about the linguistic expressivity of the input ontologies, the available mediators and other components relevant to the mediation task. In our methodology, mediators should access the input ontologies through uniform interfaces that abstract away many low-level details, while depending on generic third-party linguistic resources for external information. Given a pair of ontologies to reconcile, MAPLE ranks the available mediators according to their ability to exploit most of the input ontologies' content while coping with the exhibited degree of linguistic heterogeneity. MAPLE provides the chosen mediator with concrete linguistic resources and suitable implementations of the required interfaces. The resulting mediators are more robust, as they are isolated from many low-level issues, and their applicability and performance may increase over time as new and better resources and other components are made available. To sustain this trend, we foresee the use of the Web as a large-scale repository.
LIME: Towards a Metadata Module for Ontolex
Manuel Fiorelli
Maria Teresa Pazienza
Armando Stellato
Proceedings of the 2nd Workshop on Linked Data in Linguistics (LDL-2013): Representing and linking lexicons, terminologies and other language data
PEARL: ProjEction of Annotations Rule Language, a Language for Projecting (UIMA) Annotations over RDF Knowledge Bases
Maria Teresa Pazienza
Armando Stellato
Andrea Turbati
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
In this paper we present PEARL, a language for projecting annotations based on the Unstructured Information Management Architecture (UIMA) over RDF triples. The language offers two things. First, a query mechanism, built upon (and extending) the basic FeaturePath notation of UIMA, allows for efficient access to the standard annotation format of UIMA based on feature structures. Second, PEARL provides a syntax for projecting the retrieved information onto an RDF dataset, combining a SPARQL-like notation for matching pre-existing elements of the dataset with meta-graph patterns for storing new information into it. In this paper we present the basics of this language and how a PEARL document is structured, discuss a simple use case and introduce a wider project about automatic acquisition of knowledge, in which PEARL plays a pivotal role.
Application of a Semantic Search Algorithm to Semi-Automatic GUI Generation
Maria Teresa Pazienza
Noemi Scarpato
Armando Stellato
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
The Semantic Search research field aims to query metadata and to identify relevant subgraphs. While in traditional search engines queries are composed of lists of keywords connected through boolean operators, Semantic Search instead requires the submission of semantic queries structured as a graph of concepts, entities and relations. Submitting such a graph is not trivial, however: while any user can provide a list of keywords of interest, formulating a semantic query is considerably harder. One of the main challenges for RDF browsers lies in implementing interfaces that allow the common user to submit semantic queries by hiding their complexity. Furthermore, a good semantic search algorithm is not enough to fulfill user needs; it is worthwhile to implement visualization methods that help users intuitively understand why and how the results were retrieved. In this paper we present a novel solution to query RDF datasets and to browse the results of the queries in an appealing manner.
Generic Ontology Learners on Application Domains
Francesca Fallucchi
Maria Teresa Pazienza
Fabio Massimo Zanzotto
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
In ontology learning from texts, there are ontology-rich domains, where large structured domain knowledge repositories are available, and there are large general corpora paired with large general structured knowledge repositories such as WordNet (Miller, 1995). Ontology learning methods are more useful in ontology-poor domains. Yet, in these conditions, these methods do not perform particularly well, as training material is not sufficient. In this paper we present an LSP ontology learning method that can exploit models learned from a generic domain to extract new information in a specific domain. In our approach, we first learn a model from training data and then use the learned model to discover knowledge in a specific domain. We tested our model adaptation strategy using a background domain that is applied to learn the isa networks in the Earth Observation Domain as a specific domain. We will demonstrate that our method captures domain knowledge better than other generic models: our model better captures what is expected by domain experts than a baseline method based only on WordNet. The latter is better correlated with non-domain annotators asked to produce the ontology for the specific domain.
Maskkot — An Entity-centric Annotation Platform
Armando Stellato
Heiko Stoermer
Stefano Bortoli
Noemi Scarpato
Andrea Turbati
Paolo Bouquet
Maria Teresa Pazienza
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
The Semantic Web is facing the important challenge of maintaining its promise of a real world-wide graph of interconnected resources. Unfortunately, while URIs almost guarantee a direct reference to entities, the relation between the two is not bijective. Many different URI references to the same concepts and entities can arise when, in such a heterogeneous setting as the WWW, people independently build new ontologies or populate shared ones with new, arbitrarily identified individuals. The proliferation of URIs is an unwanted, though natural, effect strictly bound to the same principles which characterize the Semantic Web; reducing this phenomenon will improve the recall of Semantic Search engines, which could rely on explicit links between heterogeneous information sources. To address this problem, in this paper we present an integrated environment combining the semantic annotation and ontology building features available in the Semantic Turkey web browser extension with globally unique identifiers for entities provided by the okkam Entity Name System, thus realizing a valuable resource for preventing the diffusion of multiple URIs on the (Semantic) Web.
A Web Browser Extension for Growing-up Ontological Knowledge from Traditional Web Content
Maria Teresa Pazienza
Marco Pennacchiotti
Armando Stellato
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
While the Web is facing interesting new changes in the way users access, interact with and even participate in its growth, the most traditional applications dedicated to its fruition, web browsers, are not responding with the same euphoric boost for innovation, mostly relying on third-party or open-source community-driven extensions to address the new Social and Semantic Web trends and technologies. This technological and decisional gap, which is probably due to the lack of a strong standardization commitment on the one side (Web 2.0/Social Web) and to the delay in massive adherence to new officially approved standards on the other (W3C-approved Semantic Web languages), has to be filled by success stories which could pave the way for the evolution of browsers. In this work we present a novel web browser extension which combines several features coming from the worlds of terminology and information extraction, semantic annotation and knowledge management, to support users in both keeping track of interesting information they find on the web and organizing its associated content following knowledge representation standards offered by the Semantic Web.
JMWNL: an Extensible Multilingual Library for Accessing Wordnets in Different Languages
Maria Teresa Pazienza
Armando Stellato
Alexandra Tudorache
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper we present JMWNL, a multilingual extension of the JWNL Java library, which was originally developed for accessing Princeton WordNet dictionaries. JMWNL broadens the range of JWNL's accessible resources by also covering dictionaries produced inside the EuroWordNet project. Specific resources, such as language-dependent algorithmic stemmers, have been adopted to cover the diversity in the morphological nature of words in the addressed languages. New semantic and lexical relations have been included to maximize compatibility with new versions of the original Princeton WordNet and to include the whole range of relations from EuroWordNet. Relations from Princeton WordNet on one side and EuroWordNet on the other have in some cases been mapped to provide a uniform reference for coherent cross-linguistic use of the library.
A Bottom-up Comparative Study of EuroWordNet and WordNet 3.0 Lexical and Semantic Relations
Maria Teresa Pazienza
Armando Stellato
Alexandra Tudorache
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
The paper presents a comparative study of semantic and lexical relations defined and adopted in WordNet and EuroWordNet. This document describes the experimental observations made through the analysis of data from different WordNet versions and EuroWordNet distributions for different languages, during the development of JMWNL (Java Multilingual WordNet Library), an extensible multilingual library for accessing WordNet-like resources in different languages and formats. The goal of this work was to realize an operative mapping between the relations defined in the two lexical resources and to unify library access and content navigation methods for both WordNet and EuroWordNet. The analysis focused on similarities, differences, semantic overlaps or inclusions, factual misinterpretations and inconsistencies between the intended and practical use of each single relation defined in these two linguistic resources. The paper details the produced mapping with examples, discussing the required operations, which involved merging, extending or simply keeping separate the examined relations.
Clustering of Terms from Translation Dictionaries and Synonyms Lists to Automatically Build more Structured Linguistic Resources
Maria Teresa Pazienza
Armando Stellato
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Building a Linguistic Resource (LR) is a task requiring a huge quantity of means, human resources and funds. Though finalizing the development phase and assessing the produced resource necessarily require human involvement, a computer-aided process for building the resource's initial structure would greatly reduce the overall effort to be undertaken. We present here a novel approach for automating the process of building structured (possibly multilingual) LRs, starting from already available LRs and exploiting simple vocabularies of synonyms and/or translations for different languages. A simple algorithm for clustering terms according to their shared senses is presented in two versions, one for separating flat lists of synonyms and one for flat lists of translations. The algorithm is then motivated against two possible exploitations: reducing the cost of producing new LRs, and linguistically enriching the content of existing semantic resources, like SW ontologies and knowledge bases. Empirical results are provided for two experimental setups: automatic term clustering for English synonym lists, and for Italian translations of English terms.
Discovering Asymmetric Entailment Relations between Verbs Using Selectional Preferences
Fabio Massimo Zanzotto
Marco Pennacchiotti
Maria Teresa Pazienza
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
Mixing WordNet, VerbNet and PropBank for studying verb relations
Maria Teresa Pazienza
Marco Pennacchiotti
Fabio Massimo Zanzotto
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
In this paper we present a novel resource for studying the semantics of verb relations. The resource is created by mixing sense relational knowledge enclosed in WordNet, frame knowledge enclosed in VerbNet and corpus knowledge enclosed in PropBank. As a result, a set of about 1000 frame pairs is made available. A frame pair represents a pair of verbs in a particular semantic relation, accompanied by specific information such as the syntactic-semantic frames of the two verbs, the mapping among their thematic roles and a set of textual examples extracted from the PennTreeBank. We specifically focus on four relations: Troponymy, Causation, Entailment and Antonymy. The different steps required for the mapping are described in detail and statistics on resource mutual coverage are reported. We also propose a practical use of the resource for the task of Textual Entailment acquisition and for Question Answering. A first attempt to automate the mapping among verb arguments is also presented: early experiments show that simple techniques can achieve good results, up to 85% F-Measure.
Discovering Entailment Relations Using “Textual Entailment Patterns”
Fabio Massimo Zanzotto
Maria Teresa Pazienza
Marco Pennacchiotti
Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment
A2Q: An Agent-based Architecture for Multilingual Q&A
Roberto Basili
Nicola Lorusso
Maria Teresa Pazienza
Fabio Massimo Zanzotto
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Ontological resources and question answering
Roberto Basili
Dorte H. Hansen
Patrizia Paggio
Maria Teresa Pazienza
Fabio Massimo Zanzotto
Proceedings of the Workshop on Pragmatics of Question Answering at HLT-NAACL 2004
Demonstration of the CROSSMARC System
Vangelis Karkaletsis
Constantine D. Spyropoulos
Dimitris Souflis
Claire Grover
Ben Hachey
Maria Teresa Pazienza
Michele Vindigni
Emmanuel Cartier
Jose Coch
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations
Knowledge-Based Multilingual Document Analysis
R. Basili
R. Catizone
L. Padro
M.T. Pazienza
G. Rigau
A. Setzer
N. Webb
F. Zanzotto
COLING-02: SEMANET: Building and Using Semantic Networks
Multilingual XML-Based Named Entity Recognition for E-Retail Domains
Claire Grover
Scott McDonald
Donnla Nic Gearailt
Vangelis Karkaletsis
Dimitra Farmakiotou
Georgios Samaritakis
Georgios Petasis
Maria Teresa Pazienza
Michele Vindigni
Frantz Vichot
Francis Wolinski
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
Decision Trees as Explicit Domain Term Definitions
Roberto Basili
Maria Teresa Pazienza
Fabio Massimo Zanzotto
COLING 2002: The 19th International Conference on Computational Linguistics
Multilingual Authoring: the NAMIC Approach
Roberto Basili
Maria Teresa Pazienza
Fabio Massimo Zanzotto
Roberta Catizone
Andrea Setzer
Nick Webb
Yorick Wilks
Lluís Padró
German Rigau
Proceedings of the ACL 2001 Workshop on Human Language Technology and Knowledge Management
Tuning Lexicons to New Operational Scenarios
Roberto Basili
Maria Teresa Pazienza
Michele Vindigni
Fabio Massimo Zanzotto
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
The Italian Syntactic-Semantic Treebank: Architecture, Annotation, Tools and Evaluation
S. Montemagni
F. Barsotti
M. Battista
N. Calzolari
O. Corazzari
A. Zampolli
F. Fanciulli
M. Massetani
R. Raffaelli
R. Basili
M. T. Pazienza
D. Saracino
F. Zanzotto
N. Mana
F. Pianesi
R. Delmonte
Proceedings of the COLING-2000 Workshop on Linguistically Interpreted Corpora
Customizable Modular Lexicalized Parsing
R. Basili
M. T. Pazienza
F. M. Zanzotto
Proceedings of the Sixth International Workshop on Parsing Technologies
Different NLP applications have different efficiency constraints (i.e. quality of the results and throughput) that reflect on each core linguistic component. Syntactic processors are basic modules in some NLP applications. A customization that permits the performance control of these components enables their reuse in different application scenarios. Throughput has commonly been improved using partial syntactic processors. On the other hand, specialized lexicons are generally employed to improve the quality of the syntactic material produced by specific parsing (sub)processes (e.g. verb argument detection or PP attachment disambiguation). Building upon the idea of grammar stratification, this paper presents a method to push modularity and lexical sensitivity in parsing, in view of customizable syntactic analysers. A framework for modular parser design is proposed and its main properties are discussed. Parsers (i.e. different parsing module chains) are then presented and their performance is analyzed in application-driven scenarios.
Automatic Adaptation of WordNet to Sublanguages and to Computational Tasks
Roberto Basili
Alessandro Cucchiarelli
Carlo Consoli
Maria Teresa Pazienza
Paola Velardi
Usage of WordNet in Natural Language Processing Systems
Towards a Bootstrapping Framework for Corpus Semantic Tagging
Roberto Basili
Michelangelo Della Rocca
Maria Teresa Pazienza
Tagging Text with Lexical Semantics: Why, What, and How?
Inducing Terminology for Lexical Acquisition
Roberto Basili
Gianluca De Rossi
Maria Teresa Pazienza
Second Conference on Empirical Methods in Natural Language Processing
Integrating General-purpose and Corpus-based Verb Classification
Roberto Basili
Maria Teresa Pazienza
Paola Velardi
Computational Linguistics, Volume 22, Number 4, December 1996
Unsupervised Learning of Syntactic Knowledge: Methods and Measures
R. Basili
A. Marziali
M.T. Pazienza
P. Velardi
Conference on Empirical Methods in Natural Language Processing
Might a semantic lexicon support hypertextual authoring?
Roberto Basili
Fabrizio Grisoli
Maria Teresa Pazienza
Fourth Conference on Applied Natural Language Processing
A “not-so-shallow” parser for collocational analysis
R. Basili
M.T. Pazienza
P. Velardi
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics
The Noisy Channel and the Braying Donkey
Roberto Basili
Maria Teresa Pazienza
Paola Velardi
The Balancing Act: Combining Symbolic and Statistical Approaches to Language
Hierarchical Clustering of Verbs
Roberto Basili
Maria Pazienza
Paola Velardi
Acquisition of Lexical Knowledge from Text
Computational Lexicons: the Neat Examples and the Odd Exemplars
Roberto Basili
Maria Teresa Pazienza
Paola Velardi
Third Conference on Applied Natural Language Processing
How to Encode Semantic Knowledge: A Method for Meaning Representation and Computer-Aided Acquisition
Paola Velardi
Maria Teresa Pazienza
Michela Fasolo
Computational Linguistics, Volume 17, Number 2, June 1991
Computer Aided Interpretation of Lexical Cooccurrences
Paola Velardi
Maria Teresa Pazienza
27th Annual Meeting of the Association for Computational Linguistics
A Structured Representation of Word-Senses for Semantic Analysis.
Maria Teresa Pazienza
Paola Velardi
Third Conference of the European Chapter of the Association for Computational Linguistics