Matteo Dragoni


2014

pdf
A Gold Standard for CLIR evaluation in the Organic Agriculture Domain
Alessio Bosca | Matteo Casu | Matteo Dragoni | Nikolaos Marianos
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present a gold standard for the evaluation of Cross Language Information Retrieval systems in the domain of Organic Agriculture and AgroEcology. The presented resource is free to use for research purposes and it includes a collection of multilingual documents annotated with respect to a domain ontology, the ontology used for annotating the resources, a set of 48 queries in 12 languages and a gold standard with the correct resources for the proposed queries. The goal of this work consists in contributing to the research community with a resource for evaluating multilingual retrieval algorithms, with particular focus on domain adaptation strategies for “general purpose” multilingual information retrieval systems and on the effective exploitation of semantic annotations. Domain adaptation is in fact an important activity for tuning the retrieval system, reducing the ambiguities and improving the precision of information retrieval. Domain ontologies constitute a diffuse practice for defining the conceptual space of a corpus and mapping resources to specific topics and in our lab we propose as well to investigate and evaluate the impact of this information in enhancing the retrieval of contents. An initial experiment is described, giving a baseline for further research with the proposed gold standard.