Mauro Dragoni

2025

Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios
Saba Ghanbari Haez | Mauro Dragoni
Findings of the Association for Computational Linguistics: EMNLP 2025

Large Language Models often reproduce societal biases, yet most evaluations overlook how such biases evolve across nuanced contexts or intersecting identities. We introduce a scenario-based evaluation framework built on 100 narrative tasks, designed to be neutral at baseline and systematically modified with gender and age cues. Grounded in the theory of Normative-Narrative Scenarios, our approach provides ethically coherent and socially plausible settings for probing model behavior. Analyzing responses from five leading LLMs—GPT-4o, LLaMA 3.1, Qwen2.5, Phi-4, and Mistral—using Critical Discourse Analysis and quantitative linguistic metrics, we find consistent evidence of bias. Gender emerges as the dominant axis of bias, with intersectional cues (e.g., age and gender combined) further intensifying disparities. Our results underscore the value of dynamic narrative progression for detecting implicit, systemic biases in Large Language Models.

2024

Building Certified Medical Chatbots: Overcoming Unstructured Data Limitations with Modular RAG
Leonardo Sanna | Patrizio Bellan | Simone Magnolini | Marina Segala | Saba Ghanbari Haez | Monica Consolandi | Mauro Dragoni
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024

Creating a certified conversational agent poses several issues. The need to manage fine-grained information delivery and the necessity of providing reliable medical information require considerable effort, especially in dataset preparation. In this paper, we investigate the challenges of building a certified medical chatbot in Italian that provides information about pregnancy and early childhood. We report some negative initial results regarding the possibility of creating a certified conversational agent within the RASA framework starting from unstructured data. Finally, we propose a modular RAG model to implement a Large Language Model in a certified context, overcoming data limitations and enabling data collection on actual conversations.

2020

MTSI-BERT: A Session-aware Knowledge-based Conversational Agent
Matteo Antonio Senese | Giuseppe Rizzo | Mauro Dragoni | Maurizio Morisio
Proceedings of the Twelfth Language Resources and Evaluation Conference

In recent years, the state of the art of NLP research has made a huge step forward. Since the release of ELMo (Peters et al., 2018), a new race for the top of the leaderboards of all the main linguistic tasks has begun. Several models have been published that achieve promising results in all the major NLP applications, from question answering and named entity recognition to text classification. These research advances coincide with a growing trend toward voice-based technologies in the customer care market. One of the next big challenges in this scenario will be the handling of multi-turn conversations, a type of conversation that differs from single-turn conversation in the presence of multiple related interactions. The proposed work is an attempt to exploit one of these new milestones to handle multi-turn conversations. MTSI-BERT is a BERT-based model that achieves promising results in intent classification, knowledge base action prediction, and end of dialogue session detection, the last of which determines the right moment to fulfill the user request. The study on the realization of PuffBot, an intelligent chatbot to support and monitor people suffering from asthma, shows how this type of technique could be an important piece in the development of future chatbots.

2018

NEUROSENT-PDI at SemEval-2018 Task 1: Leveraging a Multi-Domain Sentiment Model for Inferring Polarity in Micro-blog Text
Mauro Dragoni
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes the NeuroSent system that participated in SemEval 2018 Task 1. Our system takes a supervised approach that builds on neural networks and word embeddings. The word embeddings were built from a repository of user-generated reviews and are therefore specific to sentiment analysis tasks. Tweets are then converted into their corresponding vector representations and given as input to the neural network, with the aim of learning the different semantics of each emotion considered by the SemEval task. The output layer was adapted to the characteristics of each subtask. Preliminary results obtained on the provided training set encourage further investigation in this direction.

NEUROSENT-PDI at SemEval-2018 Task 3: Understanding Irony in Social Networks Through a Multi-Domain Sentiment Model
Mauro Dragoni
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes the NeuroSent system that participated in SemEval 2018 Task 3. Our system takes a supervised approach that builds on neural networks and word embeddings. The word embeddings were built from a repository of user-generated reviews and are therefore specific to sentiment analysis tasks. Tweets are then converted into their corresponding vector representations and given as input to the neural network, with the aim of learning the different semantics of each emotion considered by the SemEval task. The output layer was adapted to the characteristics of each subtask. Preliminary results obtained on the provided training set encourage further investigation in this direction.

NEUROSENT-PDI at SemEval-2018 Task 7: Discovering Textual Relations With a Neural Network Model
Mauro Dragoni
Proceedings of the 12th International Workshop on Semantic Evaluation

Discovering semantic relations within textual documents is a timely topic worthy of investigation. Natural language processing strategies are generally used to link chunks of text in order to extract information that semantic search engines can exploit to perform complex queries. The scientific domain is an interesting area where these techniques can be applied. In this paper, we describe a system based on neural networks applied to SemEval 2018 Task 7. The system relies on word embeddings to compose the vector representation of text chunks. These representations are fed to a neural network that aims to learn the structure of paths connecting chunks associated with a specific relation. Preliminary results demonstrated the suitability of the proposed approach, encouraging further investigation of this research direction.

2016

DRANZIERA: An Evaluation Protocol For Multi-Domain Opinion Mining
Mauro Dragoni | Andrea Tettamanzi | Célia da Costa Pereira
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Opinion mining is a topic that has attracted a lot of interest in recent years. The literature shows that it is often hard to replicate system evaluations, due either to the unavailability of the data used for the evaluation or to the lack of details about the protocol used in the campaign. In this paper, we propose an evaluation protocol, called DRANZIERA, composed of a multi-domain dataset and guidelines that make it possible both to evaluate opinion mining systems in different contexts (Closed, Semi-Open, and Open) and to compare them with each other and with a number of baselines.

2015

SHELLFBK: An Information Retrieval-based System For Multi-Domain Sentiment Analysis
Mauro Dragoni
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Modeling, Managing, Exposing, and Linking Ontologies with a Wiki-based Tool
Mauro Dragoni | Alessio Bosca | Matteo Casu | Andi Rexha
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Over the last decade, the need for effective and useful tools for creating and managing linguistic resources has increased significantly. One of the main reasons is the necessity of building linguistic resources (LRs) that, besides effectively expressing the domain that users want to model, may be exploited in several ways. In this paper we present MoKi, a wiki-based collaborative tool for modeling ontologies and, more generally, any kind of linguistic resource. This tool has been customized in the context of an EU-funded project to address three important aspects of LR modeling: (i) exposing the created LRs, (ii) providing features for linking the created resources to external ones, and (iii) producing multilingual LRs in a safe manner.