Nicoletta Calzolari

Also published as: N. Calzolari, Nicoletta Calzolari Zamorani


2024

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Nicoletta Calzolari | Min-Yen Kan | Veronique Hoste | Alessandro Lenci | Sakriani Sakti | Nianwen Xue
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024): Tutorial Summaries
Roman Klinger | Naoaki Okazaki | Nicoletta Calzolari | Min-Yen Kan
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024): Tutorial Summaries

2022

Proceedings of the Thirteenth Language Resources and Evaluation Conference
Nicoletta Calzolari | Frédéric Béchet | Philippe Blache | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Jan Odijk | Stelios Piperidis
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Proceedings of the 29th International Conference on Computational Linguistics
Nicoletta Calzolari | Chu-Ren Huang | Hansaem Kim | James Pustejovsky | Leo Wanner | Key-Sun Choi | Pum-Mo Ryu | Hsin-Hsi Chen | Lucia Donatelli | Heng Ji | Sadao Kurohashi | Patrizia Paggio | Nianwen Xue | Seokhwan Kim | Younggyun Hahm | Zhong He | Tony Kyungil Lee | Enrico Santus | Francis Bond | Seung-Hoon Na
Proceedings of the 29th International Conference on Computational Linguistics

2020

Proceedings of the Twelfth Language Resources and Evaluation Conference
Nicoletta Calzolari | Frédéric Béchet | Philippe Blache | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Twelfth Language Resources and Evaluation Conference

A Shared Task of a New, Collaborative Type to Foster Reproducibility: A First Exercise in the Area of Language Science and Technology with REPROLANG2020
António Branco | Nicoletta Calzolari | Piek Vossen | Gertjan Van Noord | Dieter van Uytvanck | João Silva | Luís Gomes | André Moreira | Willem Elbers
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this paper, we introduce a new type of shared task, one that is collaborative rather than competitive, designed to support and foster the reproduction of research results. We also describe the first event running such a novel challenge, present the results obtained, discuss the lessons learned and ponder on future undertakings.

2018

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Nicoletta Calzolari | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Koiti Hasida | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis | Takenobu Tokunaga
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

LREMap, a Song of Resources and Evaluation
Riccardo Del Gratta | Sara Goggi | Gabriella Pardelli | Nicoletta Calzolari
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Nicoletta Calzolari | Khalid Choukri | Thierry Declerck | Sara Goggi | Marko Grobelnik | Bente Maegaard | Joseph Mariani | Helene Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

LREC as a Graph: People and Resources in a Network
Riccardo Del Gratta | Francesca Frontini | Monica Monachini | Gabriella Pardelli | Irene Russo | Roberto Bartolini | Fahad Khan | Claudia Soria | Nicoletta Calzolari
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This proposal describes a new way to visualise resources in the LREMap, a community-built repository of language resource descriptions and uses. The LREMap is represented as a force-directed graph, where resources, papers and authors are nodes. The analysis of the visual representation of the underlying graph is used to study how the community gathers around LRs and how LRs are used in research.

New Developments in the LRE Map
Vladimir Popescu | Lin Liu | Riccardo Del Gratta | Khalid Choukri | Nicoletta Calzolari
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we describe the new developments brought to the LRE Map, especially regarding the user interface of the Web application, the searching of the information it contains, and the updates to the data model.

2014

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Nicoletta Calzolari | Khalid Choukri | Thierry Declerck | Hrafn Loftsson | Bente Maegaard | Joseph Mariani | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm | Hans Uszkoreit | Sophia Ananiadou | Núria Bel | Audronė Bielevičienė | Lars Borin | António Branco | Gerhard Budin | Nicoletta Calzolari | Walter Daelemans | Radovan Garabík | Marko Grobelnik | Carmen García-Mateo | Josef van Genabith | Jan Hajič | Inma Hernáez | John Judge | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Joseph Mariani | John McNaught | Maite Melero | Monica Monachini | Asunción Moreno | Jan Odijk | Maciej Ogrodniczuk | Piotr Pęzik | Stelios Piperidis | Adam Przepiórkowski | Eiríkur Rögnvaldsson | Michael Rosner | Bolette Pedersen | Inguna Skadiņa | Koenraad De Smedt | Marko Tadić | Paul Thompson | Dan Tufiş | Tamás Váradi | Andrejs Vasiļjevs | Kadri Vider | Jolanta Zabarskaite
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics. This paper documents the initiative’s work throughout Europe in order to boost progress and innovation in our field.

META-SHARE: One year after
Stelios Piperidis | Harris Papageorgiou | Christian Spurk | Georg Rehm | Khalid Choukri | Olivier Hamon | Nicoletta Calzolari | Riccardo del Gratta | Bernardo Magnini | Christian Girardi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents META-SHARE (www.meta-share.eu), an open language resource infrastructure, and its usage since its Europe-wide deployment in early 2013. META-SHARE is a network of repositories that store language resources (data, tools and processing services) documented with high-quality metadata, aggregated in central inventories allowing for uniform search and access. META-SHARE was developed by META-NET (www.meta-net.eu) and aims to serve as an important component of a language technology marketplace for researchers, developers, professionals and industrial players, catering for the full development cycle of language technology, from research through to innovative products and services. The observed usage in its initial steps, the steadily increasing number of network nodes, resources, users, queries, views and downloads are all encouraging and considered as supportive of the choices made so far. In tandem, take-up activities like direct linking and processing of datasets by language processing services as well as metadata transformation to RDF are expected to open new avenues for data and resources linking and boost the organic growth of the infrastructure while facilitating language technology deployment by much wider research communities and industrial sectors.

2012

Proceedings of the 3rd Workshop on the People’s Web Meets NLP: Collaboratively Constructed Semantic Resources and their Applications to NLP
Iryna Gurevych | Nicoletta Calzolari Zamorani | Jungi Kim
Proceedings of the 3rd Workshop on the People’s Web Meets NLP: Collaboratively Constructed Semantic Resources and their Applications to NLP

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Nicoletta Calzolari | Khalid Choukri | Thierry Declerck | Mehmet Uğur Doğan | Bente Maegaard | Joseph Mariani | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The Language Library: supporting community effort for collective resource production
Riccardo Del Gratta | Francesca Frontini | Francesco Rubino | Irene Russo | Nicoletta Calzolari
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Relations among phenomena at different linguistic levels are at the essence of language properties, but today we focus mostly on one specific linguistic layer at a time, without (having the possibility of) paying attention to the relations among the different layers. At the same time, our efforts are too scattered, with little possibility of exploiting other people's achievements. To address the complexities hidden in multilayer interrelations, even small amounts of processed data can be useful, improving the performance of complex systems. Exploiting the current trend towards sharing, we want to initiate a collective movement that works towards creating synergies and harmonisation among different annotation efforts that are now dispersed. In this paper we present the general architecture of the Language Library, an initiative conceived as a facility for gathering and making available, through simple functionalities, the linguistic knowledge the field is able to produce, putting in place new ways of collaboration within the LRT community. To reach this goal, a first population round of the Language Library has started around a core of parallel/comparable texts that have been annotated by several contributors submitting a paper for LREC2012. The Language Library also has an ancillary aim related to language documentation and archiving, and it is conceived as a theory-neutral space which allows several language processing philosophies to coexist.

The LRE Map. Harmonising Community Descriptions of Resources
Nicoletta Calzolari | Riccardo Del Gratta | Gil Francopoulo | Joseph Mariani | Francesco Rubino | Irene Russo | Claudia Soria
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Accurate and reliable documentation of Language Resources is an indisputable need: documentation is the gateway to discovery of Language Resources, a necessary step towards promoting the data economy. Language resources that are not documented virtually do not exist: for this reason every initiative able to collect and harmonise metadata about resources represents a valuable opportunity for the NLP community. In this paper we describe the LRE Map, reporting statistics on resources associated with LREC2012 papers and providing comparisons with LREC2010 data. The LRE Map, jointly launched by FLaReNet and ELRA in conjunction with the LREC 2010 Conference, is an instrument for enhancing the availability of information about resources, either new or already existing ones. It aims to reinforce and facilitate the use of standards in the community. The LRE Map web interface makes it possible to search according to a fixed set of metadata and to view the details of extracted resources. The LRE Map continues to collect bottom-up input about resources from authors of other conferences through the standard submission process. This will help broaden the notion of “language resources” and attract to the field neighboring disciplines that so far have been only marginally involved by the standard notion of language resources.

The FLaReNet Strategic Language Resource Agenda
Claudia Soria | Núria Bel | Khalid Choukri | Joseph Mariani | Monica Monachini | Jan Odijk | Stelios Piperidis | Valeria Quochi | Nicoletta Calzolari
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The FLaReNet Strategic Agenda highlights the most pressing needs for the sector of Language Resources and Technologies and presents a set of recommendations for its development and progress in Europe, resulting from a three-year consultation of the FLaReNet European project. The FLaReNet recommendations are organised around nine dimensions: a) documentation b) interoperability c) availability, sharing and distribution d) coverage, quality and adequacy e) sustainability f) recognition g) development h) infrastructure and i) international cooperation. As such, they cover a broad range of topics and activities, spanning the production and use of language resources, licensing, maintenance and preservation issues, infrastructures for language resources, resource identification and sharing, evaluation and validation, interoperability and policy issues. The intended recipients belong to a large set of players and stakeholders in Language Resources and Technology, ranging from individuals to research and education institutions, to policy-makers, funding agencies, SMEs and large companies, service and media providers. The main goal of these recommendations is to serve as an instrument to support stakeholders in planning for and addressing the urgencies of the Language Resources and Technologies of the future.

2011

Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm
Nicoletta Calzolari | Toru Ishida | Stelios Piperidis | Virach Sornlertlamvanich
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm

Interoperability Framework: The FLaReNet Action Plan Proposal
Nicoletta Calzolari | Monica Monachini | Valeria Quochi
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm

The Language Library: Many Layers, More Knowledge
Nicoletta Calzolari | Riccardo Del Gratta | Francesca Frontini | Irene Russo
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm

2010

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Nicoletta Calzolari | Khalid Choukri | Bente Maegaard | Joseph Mariani | Jan Odijk | Stelios Piperidis | Mike Rosner | Daniel Tapias
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Preparing the field for an Open Resource Infrastructure: the role of the FLaReNet Network of Excellence
Nicoletta Calzolari | Claudia Soria
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In order to overcome the fragmentation that affects the field of Language Resources and Technologies, an Open and Distributed Resource Infrastructure is the necessary step for building on each other's achievements, integrating resources and technologies and avoiding dispersed or conflicting efforts. Since this endeavour represents a true cultural turning point in the LRs field, it needs careful preparation, both in terms of acceptance by the community and thoughtful investigation of the various technical, organisational and practical aspects implied. To achieve this, we need to act as a community able to join forces on a set of shared priorities and we need to act at a worldwide level. FLaReNet (Fostering Language Resources Network) is a Thematic Network funded under the EU eContent program that aims at developing the needed common vision and fostering a European and International strategy for consolidating the sector, thus enhancing competitiveness at EU level and worldwide. In this paper we present the activities undertaken by FLaReNet in order to prepare and support the establishment of such an Infrastructure, which is now becoming a reality within the new MetaNet initiative.

The LREC Map of Language Resources and Technologies
Nicoletta Calzolari | Claudia Soria | Riccardo Del Gratta | Sara Goggi | Valeria Quochi | Irene Russo | Khalid Choukri | Joseph Mariani | Stelios Piperidis
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we present the LREC Map of Language Resources and Tools, an innovative feature introduced with this LREC. The purpose of the Map is to shed light on the vast amount of resources and tools that represent the background of the research presented at LREC, in an attempt to fill a gap in the community's knowledge about the resources and tools that are used or created worldwide. It also aims at a change of culture in the field, actively engaging each researcher in the task of documenting resources. The Map has been developed on the basis of the information provided by LREC authors during the submission of papers to the LREC 2010 conference and the LREC workshops, and contains information about almost 2000 resources. The paper illustrates the motivation behind this initiative, its main characteristics, its relevance and future impact in the field, the metadata used to describe the resources, and finally presents some of the most relevant findings.

An LMF-based Web Service for Accessing WordNet-type Semantic Lexicons
Bora Savas | Yoshihiko Hayashi | Monica Monachini | Claudia Soria | Nicoletta Calzolari
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes a Web service for accessing WordNet-type semantic lexicons. The central idea behind the service design is: given a query, the primary functionality of lexicon access is to present a partial lexicon by extracting the relevant part of the target lexicon. Based on this idea, we implemented the system as a RESTful Web service whose input query is specified by the access URI and whose output is presented in a standardized XML data format. LMF, an ISO standard for modeling lexicons, plays the most prominent role: the access URI pattern basically reflects the lexicon structure as defined by LMF; the access results are rendered based on Wordnet-LMF, which is a version of LMF XML-serialization. The Web service currently provides access to Princeton WordNet, Japanese WordNet, as well as the EDR Electronic Dictionary as a trial. To accommodate the EDR dictionary within the same framework, we modeled it also as a WordNet-type semantic lexicon. This paper thus discusses possible alternatives for modeling innately bilingual/multilingual lexicons like EDR with LMF, and proposes possible revisions to Wordnet-LMF.

Resource and Service Centres as the Backbone for a Sustainable Service Infrastructure
Peter Wittenburg | Nuria Bel | Lars Borin | Gerhard Budin | Nicoletta Calzolari | Eva Hajicova | Kimmo Koskenniemi | Lothar Lemnitzer | Bente Maegaard | Maciej Piasecki | Jean-Marie Pierrel | Stelios Piperidis | Inguna Skadina | Dan Tufis | Remco van Veenendaal | Tamas Váradi | Martin Wynne
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Currently, research infrastructures are being designed and established in many disciplines since they all suffer from an enormous fragmentation of their resources and tools. In the domain of language resources and tools the CLARIN initiative has been funded since 2008 to overcome many of the integration and interoperability hurdles. CLARIN can build on knowledge and work from many projects that were carried out in recent years and wants to build stable and robust services that can be used by researchers. Service centres that have the potential to be persistent and that adhere to the criteria established by CLARIN will play an important role here. In the last year of the so-called preparatory phase, these centres are developing four use cases that can demonstrate how the various pillars CLARIN has been working on can be integrated. All four use cases fulfil the criterion of being cross-national.

A Road Map for Interoperable Language Resource Metadata
Christopher Cieri | Khalid Choukri | Nicoletta Calzolari | D. Terence Langendoen | Johannes Leveling | Martha Palmer | Nancy Ide | James Pustejovsky
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

LRs remain expensive to create and thus rare relative to demand across languages and technology types. The accidental re-creation of an LR that already exists is a nearly unforgivable waste of scarce resources that is unfortunately not so easy to avoid. The number of catalogs the HLT researcher must search, with their different formats, makes it possible to overlook an existing resource. This paper sketches the sources of this problem and outlines a proposal to rectify it, along with a new vision of LR cataloging that will facilitate the documentation and exploitation of a much wider range of LRs than previously considered.

2009

The SILT and FlaReNet International Collaboration for Interoperability
Nancy Ide | James Pustejovsky | Nicoletta Calzolari | Claudia Soria
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

Query Expansion using LMF-Compliant Lexical Resources
Takenobu Tokunaga | Dain Kaplan | Nicoletta Calzolari | Monica Monachini | Claudia Soria | Virach Sornlertlamvanich | Thatsanee Charoenporn | Yingju Xia | Chu-Ren Huang | Shu-Kai Hsieh | Kiyoaki Shirai
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

The FLaReNet Thematic Network: A Global Forum for Cooperation
Nicoletta Calzolari | Claudia Soria
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

2008

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Nicoletta Calzolari | Khalid Choukri | Bente Maegaard | Joseph Mariani | Jan Odijk | Stelios Piperidis | Daniel Tapias
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

A lexicon for biology and bioinformatics: the BOOTStrep experience.
Valeria Quochi | Monica Monachini | Riccardo Del Gratta | Nicoletta Calzolari
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the design, implementation and population of a lexical resource for biology and bioinformatics (the BioLexicon) developed within an ongoing European project. The aim of this project is text-based knowledge harvesting for support to information extraction and text mining in the biomedical domain. The BioLexicon is a large-scale lexical-terminological resource encoding different information types in one single integrated resource. In the design of the resource we follow the ISO/DIS 24613 “Lexical Mark-up Framework” standard, which ensures reusability of the information encoded and easy exchange of both data and architecture. The design of the resource also takes into account the needs of our text mining partners who automatically extract syntactic and semantic information from texts and feed it to the lexicon. The present contribution first describes in detail the model of the BioLexicon along its three main layers: morphology, syntax and semantics; then, it briefly describes the database implementation of the model and the population strategy followed within the project, together with an example. The BioLexicon database in fact comes equipped with automatic uploading procedures based on a common exchange XML format, which guarantees that the lexicon can be properly populated with data coming from different sources.

KYOTO: a System for Mining, Structuring and Distributing Knowledge across Languages and Cultures
Piek Vossen | Eneko Agirre | Nicoletta Calzolari | Christiane Fellbaum | Shu-kai Hsieh | Chu-Ren Huang | Hitoshi Isahara | Kyoko Kanzaki | Andrea Marchetti | Monica Monachini | Federico Neri | Remo Raffaelli | German Rigau | Maurizio Tescon | Joop VanGent
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We outline work performed within the framework of a current EC project. The goal is to construct a language-independent information system for a specific domain (environment/ecology/biodiversity) anchored in a language-independent ontology that is linked to wordnets in seven languages. For each language, information extraction and identification of lexicalized concepts with ontological entries is carried out by text miners (“Kybots”). The mapping of language-specific lexemes to the ontology allows for crosslinguistic identification and translation of equivalent terms. The infrastructure developed within this project enables long-range knowledge sharing and transfer across many languages and cultures, addressing the need for global and uniform transition of knowledge beyond the specific domains addressed here.

Evaluation of Natural Language Tools for Italian: EVALITA 2007
Bernardo Magnini | Amedeo Cappelli | Fabio Tamburini | Cristina Bosco | Alessandro Mazzei | Vincenzo Lombardo | Francesca Bertagna | Nicoletta Calzolari | Antonio Toral | Valentina Bartalesi Lenzi | Rachele Sprugnoli | Manuela Speranza
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

EVALITA 2007, the first edition of the initiative devoted to the evaluation of Natural Language Processing tools for Italian, provided a shared framework where participants’ systems had the possibility to be evaluated on five different tasks, namely Part of Speech Tagging (organised by the University of Bologna), Parsing (organised by the University of Torino), Word Sense Disambiguation (organised by CNR-ILC, Pisa), Temporal Expression Recognition and Normalization (organised by CELCT, Trento), and Named Entity Recognition (organised by FBK, Trento). We believe that the diffusion of shared tasks and shared evaluation practices is a crucial step towards the development of resources and tools for Natural Language Processing. Experiences of this kind, in fact, are a valuable contribution to the validation of existing models and data, allowing for consistent comparisons among approaches and among representation schemes. The good response obtained by EVALITA, both in the number of participants and in the quality of results, showed that pursuing such goals is feasible not only for English, but also for other languages.

Foundation of a Component-based Flexible Registry for Language Resources and Technology
Daan Broeder | Thierry Declerck | Erhard Hinrichs | Stelios Piperidis | Laurent Romary | Nicoletta Calzolari | Peter Wittenburg
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Within the CLARIN e-science infrastructure project it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the currently available systems with respect to inflexible fixed schemas, unsuitable terminology and interoperability problems. The registry will address interoperability needs by referring to a shared vocabulary registered in data category registries as they are suggested by ISO.

Adapting International Standard for Asian Language Technologies
Takenobu Tokunaga | Dain Kaplan | Chu-Ren Huang | Shu-Kai Hsieh | Nicoletta Calzolari | Monica Monachini | Claudia Soria | Kiyoaki Shirai | Virach Sornlertlamvanich | Thatsanee Charoenporn | YingJu Xia
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Corpus-based approaches and statistical approaches have been the mainstream of natural language processing research for the past two decades. Language resources play a key role in such approaches, but there is an insufficient amount of language resources in many Asian languages. In this situation, standardisation of language resources would be of great help in developing resources in new languages. This paper presents the latest development efforts of our project which aims at creating a common standard for Asian language resources that is compatible with an international standard. In particular, the paper focuses on i) lexical specification and data categories relevant for building multilingual lexical resources for Asian languages; ii) a core upper-layer ontology needed for ensuring multilingual interoperability and iii) the evaluation platform used to test the entire architectural framework.

UFRA: a UIMA-based Approach to Federated Language Resource Architecture
Riccardo Del Gratta | Roberto Bartolini | Tommaso Caselli | Monica Monachini | Claudia Soria | Nicoletta Calzolari
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we address the issue of developing an interoperable infrastructure for language resources and technologies. In our approach, called UFRA, we extend the Federated Database Architecture System, adding typical functionalities coming from UIMA. In this way, we capitalize on the advantages of a federated architecture, such as autonomy, heterogeneity and distribution of components, monitored by a central authority responsible for checking both the integration of components and user rights on performing different tasks. We use the UIMA approach to manage and define one common front-end, enabling users and clients to query, retrieve and use language resources and technologies. The purpose of this paper is to show how UIMA leads from a Federated Database Architecture to a Federated Resource Architecture, adding to a registry of available components both static resources such as lexicons and corpora and dynamic ones such as tools and general purpose language technologies. At the end of the paper, we present a case-study that adopts this framework to integrate the SIMPLE lexicon and TIMEML annotation guidelines to tag natural language texts.

Ontologizing Lexicon Access Functions based on an LMF-based Lexicon Taxonomy
Yoshihiko Hayashi | Chiharu Narawa | Monica Monachini | Claudia Soria | Nicoletta Calzolari
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper discusses ontologization of lexicon access functions in the context of a service-oriented language infrastructure, such as the Language Grid. In such a language infrastructure, an access function to a lexical resource, embodied as an atomic Web service, plays a crucially important role in composing a composite Web service tailored to a user’s specific requirement. To facilitate the composition process involving service discovery, planning and invocation, the language infrastructure should be ontology-based; hence the ontologization of a range of lexicon functions is highly required. In a service-oriented environment, lexical resources, however, can be classified from a service-oriented perspective rather than from a lexicographically motivated standard. Hence, to address the issue of interoperability, the taxonomy for lexical resources should be grounded in a principled and shared lexicon ontology. To do this, we have ontologized the standardized lexicon modeling framework LMF, and utilized it as a foundation to stipulate the service-oriented lexicon taxonomy and the corresponding ontology for lexicon access functions. This paper also examines a possible solution to fill the gap between the ontological descriptions and the actual Web service API by adopting SAWSDL, a W3C recommendation with which Web service descriptions can be linked with the domain ontology.

2006

bib
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Nicoletta Calzolari | Khalid Choukri | Aldo Gangemi | Bente Maegaard | Joseph Mariani | Jan Odijk | Daniel Tapias
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

pdf
Unified Lexicon and Unified Morphosyntactic Specifications for Written and Spoken Italian
Monica Monachini | Nicoletta Calzolari | Khalid Choukri | Jochen Friedrich | Giulio Maltese | Michele Mammini | Jan Odijk | Marisa Ulivieri
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The goal of this paper is (1) to illustrate a specific procedure for merging different monolingual lexicons, focussing on techniques for detecting and mapping equivalent lexical entries, and (2) to sketch a production model that enables one to obtain lexical resources via unification of existing data. We describe the creation of a Unified Lexicon (UL) from a common sample of the Italian PAROLE-SIMPLE-CLIPS phonological lexicon and of the Italian LCSTAR pronunciation lexicon. We expand previous experiments carried out at ILC-CNR: based on a detailed mechanism for mapping grammatical classifications of candidate UL entries, a consensual set of Unified Morphosyntactic Specifications (UMS) shared by lexica for the written and spoken areas is proposed. The impact of the UL on cross-validation issues is analysed: by looking into conflicts, mismatches and diverging classifications can be detected in both resources. The work presented is in line with the activities promoted by ELRA towards the development of methods for packaging new language resources by combining independently created resources, and was carried out as part of the ELRA Production Committee activities. ELRA aims to exploit the UL experience to carry out such merging activities for resources available on the ELRA catalogue in order to fulfill the users' needs.

pdf
Moving to dynamic computational lexicons with LeXFlow
Claudia Soria | Maurizio Tesconi | Francesca Bertagna | Nicoletta Calzolari | Andrea Marchetti | Monica Monachini
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we present LeXFlow, a web application framework where lexicons already expressed in standardised format semi-automatically interact by reciprocally enriching themselves. LeXFlow is intended for, on the one hand, paving the way to the development of dynamic multi-source lexicons; and on the other, for fostering the adoption of standards. Borrowing from techniques used in the domain of document workflows, we model the activity of lexicon management as a particular case of workflow instance, where lexical entries move across agents and become dynamically updated. To this end, we have designed a lexical flow (LF) corresponding to the scenario where an entry of a lexicon A becomes enriched via basically two steps. First, by virtue of being mapped onto a corresponding entry belonging to a lexicon B, the entry(LA) inherits the semantic relations available in lexicon B. Second, by resorting to an automatic application that acquires information about semantic relations from corpora, the relations acquired are integrated into the entry and proposed to the human encoder. As a result of the lexical flow, in addition, for each starting lexical entry(LA) mapped onto a corresponding entry(LB) the flow produces a new entry representing the merging of the original two.

pdf
Language Resources Production Models: the Case of the INTERA Multilingual Corpus and Terminology
Maria Gavrilidou | Penny Labropoulou | Stelios Piperidis | Voula Giouli | Nicoletta Calzolari | Monica Monachini | Claudia Soria | Khalid Choukri
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper reports on the multilingual Language Resources (MLRs), i.e. parallel corpora and terminological lexicons for less widely digitally available languages, that have been developed in the INTERA project and the methodology adopted for their production. Special emphasis is given to the reality factors that have influenced the MLRs development approach and their final constitution. Building on the experience gained in the project, a production model has been elaborated, suggesting ways and techniques that can be exploited in order to improve LRs production taking into account realistic issues.

pdf
Lexical Markup Framework (LMF)
Gil Francopoulo | Monte George | Nicoletta Calzolari | Monica Monachini | Nuria Bel | Mandy Pet | Claudia Soria
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Optimizing the production, maintenance and extension of lexical resources is one of the crucial aspects impacting Natural Language Processing (NLP). A second aspect involves optimizing the process leading to their integration in applications. In this respect, we believe that the production of a consensual specification on lexicons can be a useful aid for the various NLP actors. Within ISO, the purpose of LMF is to define a standard for lexicons. LMF is a model that provides a common standardized framework for the construction of NLP lexicons. The goals of LMF are to provide a common model for the creation and use of lexical resources, to manage the exchange of data between and among these resources, and to enable the merging of a large number of individual electronic resources to form extensive global electronic resources. In this paper, we describe the work in progress within the sub-group ISO-TC37/SC4/WG4. Various experts from a lot of countries have been consulted in order to take into account best practices in a lot of languages for (we hope) all kinds of NLP lexicons.

pdf
Next Generation Language Resources using Grid
Federico Calzolari | Eva Sassolini | Manuela Sassi | Sebastiana Cucurullo | Eugenio Picchi | Francesca Bertagna | Alessandro Enea | Monica Monachini | Claudia Soria | Nicoletta Calzolari
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents a case study concerning the challenges and requirements posed by next generation language resources, realized as an overall model of open, distributed and collaborative language infrastructure. If a sort of “new paradigm” for language resource sharing is required, we think that the emerging and still evolving technology connected to Grid computing is a very interesting and suitable one for a concrete realization of this vision. Given the current limitations of Grid computing, it is very important to test the new environment on basic language analysis tools, in order to get a sense of the potential and the possible limitations of its use in NLP. For this reason, we have done some experiments on a module of the Linguistic Miner, i.e. the extraction of linguistic patterns from restricted domain corpora. The Grid environment has produced the expected results (reduction of the processing time, huge storage capacity, data redundancy) without any additional cost for the final user.

pdf bib
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
Nicoletta Calzolari | Claire Cardie | Pierre Isabelle
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf
Infrastructure for Standardization of Asian Language Resources
Takenobu Tokunaga | Virach Sornlertlamvanich | Thatsanee Charoenporn | Nicoletta Calzolari | Monica Monachini | Claudia Soria | Chu-Ren Huang | YingJu Xia | Hao Yu | Laurent Prevot | Kiyoaki Shirai
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf
LeXFlow: A System for Cross-Fertilization of Computational Lexicons
Maurizio Tesconi | Andrea Marchetti | Francesca Bertagna | Monica Monachini | Claudia Soria | Nicoletta Calzolari
Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions

pdf bib
Lexical Markup Framework (LMF) for NLP Multilingual Resources
Gil Francopoulo | Nuria Bel | Monte George | Nicoletta Calzolari | Monica Monachini | Mandy Pet | Claudia Soria
Proceedings of the Workshop on Multilingual Language Resources and Interoperability

pdf
Towards Agent-based Cross-Lingual Interoperability of Distributed Lexical Resources
Claudia Soria | Maurizio Tesconi | Andrea Marchetti | Francesca Bertagna | Monica Monachini | Chu-Ren Huang | Nicoletta Calzolari
Proceedings of the Workshop on Multilingual Language Resources and Interoperability

2004

pdf
Senseval-3: The Italian all-words task
Marisa Ulivieri | Elisabetta Guazzini | Francesca Bertagna | Nicoletta Calzolari
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

pdf
Towards a Language Infrastructure for the Semantic Web
Thierry Declerck | Paul Buitelaar | Nicoletta Calzolari | Alessandro Lenci
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf
ENABLER Thematic Network of National Projects: Technical, Strategic and Political Issues of LRs
Nicoletta Calzolari | Khalid Choukri | Maria Gavrilidou | Bente Maegaard | Paola Baroni | Hanne Fersøe | Alessandro Lenci | Valérie Mapelli | Monica Monachini | Stelios Piperidis
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf
Content Interoperability of Lexical Resources: Open Issues and “MILE” Perspectives
Francesca Bertagna | Alessandro Lenci | Monica Monachini | Nicoletta Calzolari
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

pdf
RDF Instantiation of ISLE/MILE Lexical Entries
Nancy Ide | Alessandro Lenci | Nicoletta Calzolari
Proceedings of the ACL 2003 Workshop on Linguistic Annotation: Getting the Model Right

2002

pdf
Broadening the Scope of the EAGLES/ISLE Lexical Standardization Initiative
Nicoletta Calzolari | Alessandro Lenci | Francesca Bertagna | Antonio Zampolli
COLING-02: The 3rd Workshop on Asian Language Resources and International Standardization

pdf
CLIPS, a Multi-level Italian Computational Lexicon: a Glimpse to Data
Nilda Ruimy | Monica Monachini | Raffaella Distante | Elisabetta Guazzini | Stefano Molino | Marisa Ulivieri | Nicoletta Calzolari | Antonio Zampolli
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Integrating Two Semantic Lexicons, SIMPLE and ItalWordNet: What Can We Gain?
Adriana Roventini | Marisa Ulivieri | Nicoletta Calzolari
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Standards & best practice for multilingual computational lexicons: ISLE MILE and more
Nicoletta Calzolari | Ralph Grishman | Martha Palmer
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
From Resources to Applications. Designing the Multilingual ISLE Lexical Entry
Sue Atkins | Nuria Bel | Francesca Bertagna | Pierrette Bouillon | Nicoletta Calzolari | Christiane Fellbaum | Ralph Grishman | Alessandro Lenci | Catherine MacLeod | Martha Palmer | Gregor Thurmair | Marta Villegas | Antonio Zampolli
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Towards Best Practice for Multiword Expressions in Computational Lexicons
Nicoletta Calzolari | Charles J. Fillmore | Ralph Grishman | Nancy Ide | Alessandro Lenci | Catherine MacLeod | Antonio Zampolli
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Multilingual Summarization by Integrating Linguistic Resources in the MLIS-MUSI Project
Alessandro Lenci | Roberto Bartolini | Nicoletta Calzolari | Ana Agua | Stephan Busemann | Emmanuel Cartier | Karine Chevreau | José Coch
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

pdf
International Standards for Multilingual Resource Sharing: The ISLE Computational Lexicon Working Group
Nicoletta Calzolari | Alessandro Lenci | Antonio Zampolli
Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources

pdf
The ISLE in the ocean. Transatlantic standards for multilingual lexicons (with an eye to machine translation)
Nicoletta Calzolari | Alessandro Lenci | Antonio Zampolli | Nuria Bel | Marta Villegas | Gregor Thurmair
Proceedings of Machine Translation Summit VIII

The ISLE project is a continuation of the long-standing EAGLES initiative, carried out under the Human Language Technology (HLT) programme in collaboration between American and European groups in the framework of the EU-US International Research Co-operation, supported by NSF and EC. In this paper we concentrate on the current position of the ISLE Computational Lexicon Working Group (CLWG), whose activities aim at defining a general schema for a multilingual lexical entry (MILE), as the basis for a standard framework for multilingual computational lexicons. The needs and features of existing Machine Translation systems provide the main reference points for the process of consensual definition of the MILE. The overall structure of the MILE will be illustrated with particular attention to some of the issues raised for multilingual lexicons by the need to express complex transfer conditions among translation equivalents.

pdf
The Italian Lexical Sample Task
Francesca Bertagna | Claudia Soria | Nicoletta Calzolari
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

pdf
The Italian Syntactic-Semantic Treebank: Architecture, Annotation, Tools and Evaluation
S. Montemagni | F. Barsotti | M. Battista | N. Calzolari | O. Corazzari | A. Zampolli | F. Fanciulli | M. Massetani | R. Raffaelli | R. Basili | M. T. Pazienza | D. Saracino | F. Zanzotto | N. Mana | F. Pianesi | R. Delmonte
Proceedings of the COLING-2000 Workshop on Linguistically Interpreted Corpora

pdf
Encoding information on adjectives in a lexical-semantic net for computational applications
Antonietta Alonge | Francesca Bertagna | Nicoletta Calzolari | Adriana Roventini | Antonio Zampolli
1st Meeting of the North American Chapter of the Association for Computational Linguistics

pdf
An Experiment of Lexical-Semantic Tagging of an Italian Corpus
Ornella Corazzari | Nicoletta Calzolari | Antonio Zampolli
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
SIMPLE: A General Framework for the Development of Multilingual Lexicons
Nuria Bel | Federica Busa | Nicoletta Calzolari | Elisabetta Gola | Alessandro Lenci | Monica Monachini | Antoine Ogonowski | Ivonne Peters | Wim Peters | Nilda Ruimy | Marta Villegas | Antonio Zampolli
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
Multilingual Linguistic Resources: From Monolingual Lexicons to Bilingual Interrelated Lexicons
Marta Villegas | Nuria Bel | Alessandro Lenci | Nicoletta Calzolari | Nilda Ruimy | Antonio Zampolli | Teresa Sadurní | Joan Soler
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
ItalWordNet: a Large Semantic Database for Italian
Adriana Roventini | Antonietta Alonge | Nicoletta Calzolari | Bernardo Magnini | Francesca Bertagna
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

1999

pdf
Harmonised large-scale syntactic/semantic lexicons: a European multilingual infrastructure
Nicoletta Calzolari | Antonio Zampolli
Proceedings of Machine Translation Summit VII

The paper aims at providing an overview of the situation of Language Resources (LR) in Europe, in particular as emerging from a few European projects regarding the construction of large-scale harmonised resources to be used for many applicative purposes, also of multilingual nature. An important research aspect of the projects is given by the very fact that the large enterprise described is, to our knowledge, the first attempt at developing wide-coverage lexicons for so many languages (12 European languages), with a harmonised common model, and with encoding of structured "semantic types" and semantic (subcategorisation) frames on a large scale. Reaching a common agreed model grounded on sound theoretical approaches within a very large consortium is in itself a challenging task. The actual lexicons will then provide a framework for testing and evaluating the maturity of the current state-of-the-art in lexical semantics grounded on, and connected to, a syntactic foundation. Another research aspect is provided by the recognition of the necessity of accompanying these "static" lexicons with dynamic means of acquiring lexical information from large corpora. This is one of the challenging research aspects of a global strategy for building a large and useful multilingual LR infrastructure.

1993


European efforts towards standardizing language resources
Nicoletta Calzolari
Third International EAMT Workshop: Machine Translation and the Lexicon

This paper aims at providing a broad overview of the situation in Europe during the past few years, regarding efforts and concerted actions towards the standardization of large language resources, with particular emphasis on what is taking place in the fields of Computational Lexicons and Text Corpora. Attention will be focused on the plans, work in progress, and a few preliminary results of the LRE project EAGLES (Expert Advisory Group on Language Engineering Standards).

1991

pdf
Acquiring and representing semantic information in a Lexical Knowledge Base
Nicoletta Calzolari
Lexical Semantics and Knowledge Representation

1990

pdf
Acquisition of Lexical Information from a Large Textual Italian Corpus
Nicoletta Calzolari | Remo Bindi
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics

1989

pdf
Book Reviews: Medical Language Processing: Computer Management of Narrative Data
Nicoletta Calzolari
Computational Linguistics, Volume 15, Number 3, September 1989

1988

pdf
Acquisition of Semantic Information From an On-Line Dictionary
Nicoletta Calzolari | Eugenio Picchi
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics

1987

pdf
Tools and Methods for Computational Linguistics
Roy J. Byrd | Nicoletta Calzolari | Martin S. Chodorow | Judith L. Klavans | Mary S. Neff | Omneya A. Rizk
Computational Linguistics, Formerly the American Journal of Computational Linguistics, Volume 13, Numbers 3-4, July-December 1987

1984

pdf
Detecting Patterns in a Lexical Data Base
Nicoletta Calzolari
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

pdf
Machine-Readable Dictionaries, Lexical Data Bases and the Lexical System
Nicoletta Calzolari
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

1982

pdf
Towards the Organization of Lexical Definitions on a Database Structure
Nicoletta Calzolari
Coling 1982 Abstracts: Proceedings of the Ninth International Conference on Computational Linguistics Abstracts

1973

pdf
Working on the Italian Machine Dictionary: A Semantic Approach
Nicoletta Calzolari | Laura Pecchia | Antonio Zampolli
COLING 1973 Volume 2: Computational And Mathematical Linguistics: Proceedings of the International Conference on Computational Linguistics
