Vincenzo Guerrisi


Evaluating Multi-focus Natural Language Queries over Data Services
Silvia Quarteroni | Vincenzo Guerrisi | Pietro La Torre
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Natural language interfaces to data services will be a key technology for guaranteeing effortless access to huge data repositories. This involves solving the complex problem of recognizing a relevant service or service composition given an ambiguous, potentially ungrammatical natural language question. As a first step toward this goal, we study methods for identifying the salient terms (or foci) in natural language questions, classifying the questions according to a taxonomy of services, and extracting additional relevant information in order to route them to suitable data services. While current approaches deal with single-focus (and therefore single-domain) questions, we investigate multi-focus questions with the aim of supporting conjunctive queries over the data services they refer to. Since such complex queries have seldom been studied in the literature, we have collected an ad hoc dataset, SeCo-600, containing 600 multi-domain queries annotated with a number of linguistic and pragmatic features. Our experiments with the dataset have allowed us to reach very high accuracy in different phases of query analysis, especially when adopting machine learning methods.