Simone Magnolini

2024

Building Certified Medical Chatbots: Overcoming Unstructured Data Limitations with Modular RAG
Leonardo Sanna | Patrizio Bellan | Simone Magnolini | Marina Segala | Saba Ghanbari Haez | Monica Consolandi | Mauro Dragoni
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024

Creating a certified conversational agent poses several issues. The need to manage fine-grained information delivery and the necessity to provide reliable medical information require a notable effort, especially in dataset preparation. In this paper, we investigate the challenges of building a certified medical chatbot in Italian that provides information about pregnancy and early childhood. We show some negative initial results regarding the possibility of creating a certified conversational agent within the RASA framework starting from unstructured data. Finally, we propose a modular RAG model to implement a Large Language Model in a certified context, overcoming data limitations and enabling data collection on actual conversations.

2020

Comparing Machine Learning and Deep Learning Approaches on NLP Tasks for the Italian Language
Bernardo Magnini | Alberto Lavelli | Simone Magnolini
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present a comparison between deep learning and traditional machine learning methods for various NLP tasks in Italian. We carried out experiments using available datasets (e.g., from the Evalita shared tasks) on two sequence tagging tasks (i.e., named entity recognition and nominal entity recognition) and four classification tasks (i.e., lexical relations among words, semantic relations among sentences, sentiment analysis and text classification). We show that deep learning approaches outperform traditional machine learning algorithms in sequence tagging, while for classification tasks that heavily rely on semantics, approaches based on feature engineering are still competitive. We think that a similar analysis could be carried out for other languages to provide an assessment of machine learning / deep learning models across different languages.

2019

How to Use Gazetteers for Entity Recognition with Neural Models
Simone Magnolini | Valerio Piccioni | Vevake Balaraman | Marco Guerini | Bernardo Magnini
Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5)

2018

What’s in a Food Name: Knowledge Induction from Gazetteers of Food Main Ingredient
Bernardo Magnini | Vevake Balaraman | Simone Magnolini | Marco Guerini
Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)

Toward zero-shot Entity Recognition in Task-oriented Conversational Agents
Marco Guerini | Simone Magnolini | Vevake Balaraman | Bernardo Magnini
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

We present a domain-portable zero-shot learning approach for entity recognition in task-oriented conversational agents, which does not assume any annotated sentences at training time. Rather, we derive a neural model of the entity names based only on available gazetteers, and then apply the model to recognize new entities in the context of user utterances. In order to evaluate our working hypothesis, we focus on nominal entities that are largely used in e-commerce to name products. Through a set of experiments in two languages (English and Italian) and three different domains (furniture, food, clothing), we show that the neural gazetteer-based approach outperforms several competitive baselines, with minimal requirements of linguistic features.

2016

Using WordNet to Build Lexical Sets for Italian Verbs
Anna Feltracco | Lorenzo Gatti | Elisabetta Jezek | Bernardo Magnini | Simone Magnolini
Proceedings of the 8th Global WordNet Conference (GWC)

We present a methodology for building lexical sets for argument slots of Italian verbs. We start from an inventory of semantically typed Italian verb frames and, through a mapping to WordNet, we automatically annotate the sets of fillers for the argument positions in a corpus of sentences. We evaluate both a baseline algorithm and a syntax-driven algorithm and show that the latter performs significantly better in terms of precision.

Acquiring Opposition Relations among Italian Verb Senses using Crowdsourcing
Anna Feltracco | Simone Magnolini | Elisabetta Jezek | Bernardo Magnini
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe an experiment for the acquisition of opposition relations among Italian verb senses, based on a crowdsourcing methodology. The goal of the experiment is to discuss whether the types of opposition we distinguish (i.e., complementarity, antonymy, converseness and reversiveness) are actually perceived by the crowd. In particular, we collect data for Italian by using the crowdsourcing platform CrowdFlower. We ask annotators to judge the type of opposition existing among pairs of sentences (previously judged as opposite) that differ only in a verb: the verb in the first sentence is the opposite of the verb in the second sentence. Data corroborate the hypothesis that some opposition relations exclude each other, while others interact, being recognized as compatible by the contributors.

FBK-HLT-NLP at SemEval-2016 Task 2: A Multitask, Deep Learning Approach for Interpretable Semantic Textual Similarity
Simone Magnolini | Anna Feltracco | Bernardo Magnini
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)