Mauro Dragoni

2024

Building Certified Medical Chatbots: Overcoming Unstructured Data Limitations with Modular RAG
Leonardo Sanna | Patrizio Bellan | Simone Magnolini | Marina Segala | Saba Ghanbari Haez | Monica Consolandi | Mauro Dragoni
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024

Creating a certified conversational agent poses several issues. The need to manage fine-grained information delivery and the necessity to provide reliable medical information require a notable effort, especially in dataset preparation. In this paper, we investigate the challenges of building a certified medical chatbot in Italian that provides information about pregnancy and early childhood. We show some negative initial results regarding the possibility of creating a certified conversational agent within the RASA framework starting from unstructured data. Finally, we propose a modular RAG model to implement a Large Language Model in a certified context, overcoming data limitations and enabling data collection on actual conversations.

2020

MTSI-BERT: A Session-aware Knowledge-based Conversational Agent
Matteo Antonio Senese | Giuseppe Rizzo | Mauro Dragoni | Maurizio Morisio
Proceedings of the Twelfth Language Resources and Evaluation Conference

In recent years, the state of the art of NLP research has made a huge step forward. Since the release of ELMo (Peters et al., 2018), a new race for the leading scoreboards of all the main linguistic tasks has begun. Several models have been published achieving promising results in all the major NLP applications, from question answering to text classification to named entity recognition. These research advances coincide with a growing trend toward voice-based technologies in the customer care market. One of the next big challenges in this scenario will be the handling of multi-turn conversations, a type of conversation that differs from single-turn conversation in the presence of multiple related interactions. The proposed work is an attempt to exploit one of these new milestones to handle multi-turn conversations. MTSI-BERT is a BERT-based model achieving promising results in intent classification, knowledge base action prediction, and end of dialogue session detection, which determines the right moment to fulfill the user request. A study on the realization of PuffBot, an intelligent chatbot to support and monitor people suffering from asthma, shows how this type of technique could be an important piece in the development of future chatbots.

2018

NEUROSENT-PDI at SemEval-2018 Task 1: Leveraging a Multi-Domain Sentiment Model for Inferring Polarity in Micro-blog Text
Mauro Dragoni
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes the NeuroSent system that participated in SemEval 2018 Task 1. Our system takes a supervised approach that builds on neural networks and word embeddings. Word embeddings were built starting from a repository of user-generated reviews; thus, they are specific to sentiment analysis tasks. Tweets are then converted into the corresponding vector representation and given as input to the neural network, with the aim of learning the different semantics contained in each emotion taken into account by the SemEval task. The output layer has been adapted based on the characteristics of each subtask. Preliminary results obtained on the provided training set encourage further investigation in this direction.

NEUROSENT-PDI at SemEval-2018 Task 3: Understanding Irony in Social Networks Through a Multi-Domain Sentiment Model
Mauro Dragoni
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes the NeuroSent system that participated in SemEval 2018 Task 3. Our system takes a supervised approach that builds on neural networks and word embeddings. Word embeddings were built starting from a repository of user-generated reviews; thus, they are specific to sentiment analysis tasks. Tweets are then converted into the corresponding vector representation and given as input to the neural network, with the aim of learning the different semantics contained in each emotion taken into account by the SemEval task. The output layer has been adapted based on the characteristics of each subtask. Preliminary results obtained on the provided training set encourage further investigation in this direction.

NEUROSENT-PDI at SemEval-2018 Task 7: Discovering Textual Relations With a Neural Network Model
Mauro Dragoni
Proceedings of the 12th International Workshop on Semantic Evaluation

Discovering semantic relations within textual documents is a timely topic worthy of investigation. Natural language processing strategies are generally used for linking chunks of text in order to extract information that can be exploited by semantic search engines for performing complex queries. The scientific domain is an interesting area where these techniques can be applied. In this paper, we describe a system based on neural networks applied to SemEval 2018 Task 7. The system relies on word embeddings to compose the vector representation of text chunks. These representations are fed to a neural network that aims to learn the structure of paths connecting chunks associated with a specific relation. Preliminary results demonstrated the suitability of the proposed approach, encouraging further investigation of this research direction.

2016

DRANZIERA: An Evaluation Protocol For Multi-Domain Opinion Mining
Mauro Dragoni | Andrea Tettamanzi | Célia da Costa Pereira
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Opinion mining is a topic that has attracted a lot of interest in recent years. The literature shows that it is often hard to replicate system evaluations, either because the data used for the evaluation is unavailable or because details about the protocol used in the campaign are missing. In this paper, we propose an evaluation protocol, called DRANZIERA, composed of a multi-domain dataset and guidelines that make it possible both to evaluate opinion mining systems in different contexts (Closed, Semi-Open, and Open) and to compare them to each other and to a number of baselines.

2015

SHELLFBK: An Information Retrieval-based System For Multi-Domain Sentiment Analysis
Mauro Dragoni
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Modeling, Managing, Exposing, and Linking Ontologies with a Wiki-based Tool
Mauro Dragoni | Alessio Bosca | Matteo Casu | Andi Rexha
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In the last decade, the need for effective and useful tools for the creation and management of linguistic resources has increased significantly. One of the main reasons is the necessity of building linguistic resources (LRs) that, besides effectively expressing the domain that users want to model, may be exploited in several ways. In this paper we present MoKi, a wiki-based collaborative tool for modeling ontologies and, more generally, any kind of linguistic resource. This tool has been customized in the context of an EU-funded project to address three important aspects of LR modeling: (i) exposing the created LRs, (ii) providing features for linking the created resources to external ones, and (iii) producing multilingual LRs in a safe manner.