Marco Guerini


Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study
Serra Sinem Tekiroğlu | Helena Bonaldi | Margherita Fanton | Marco Guerini
Findings of the Association for Computational Linguistics: ACL 2022

In this work, we present an extensive study on the use of pre-trained language models for the task of automatic Counter Narrative (CN) generation to fight online hate speech in English. We first present a comparative study to determine whether there is a particular Language Model (or class of LMs) and a particular decoding mechanism that are the most appropriate to generate CNs. Findings show that autoregressive models combined with stochastic decodings are the most promising. We then investigate how an LM performs in generating a CN with regard to an unseen target of hate. We find that a key element for successful ‘out of target’ experiments is not an overall similarity with the training data but the presence of a specific subset of training data, i.e. a target that shares some commonalities with the test target that can be defined a priori. We finally introduce the idea of a pipeline based on the addition of an automatic post-editing step to refine generated CNs.
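As a rough illustration of what a stochastic decoding step looks like next to a deterministic one, the sketch below contrasts top-k sampling with greedy selection over a toy list of logits. The abstract does not say which decoding schemes were compared; top-k sampling is used here only as one common stochastic scheme, and the function names `top_k_sample` and `greedy` are hypothetical.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample one token index from the k highest-scoring logits."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept logits only (shifted by the max for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one index in proportion to its renormalised weight.
    return rng.choices(top, weights=weights, k=1)[0]


def greedy(logits):
    """Deterministic baseline: always pick the highest-scoring index."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    toy_logits = [2.0, 0.5, 1.5, -1.0]  # made-up scores for a 4-token vocabulary
    rng = random.Random(0)              # seeded so runs are repeatable
    print("greedy:", greedy(toy_logits))
    print("top-2 samples:", [top_k_sample(toy_logits, 2, rng) for _ in range(10)])
```

Greedy selection returns the same index on every call, while the sampler varies among the k best candidates, which is the property usually credited with less repetitive output.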

Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering
Helena Bonaldi | Sara Dellantonio | Serra Sinem Tekiroğlu | Marco Guerini
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Fighting online hate speech is a challenge that is usually addressed using Natural Language Processing via automatic detection and removal of hate content. Besides this approach, counter narratives have emerged as an effective tool employed by NGOs to respond to online hate on social media platforms. For this reason, Natural Language Generation is currently being studied as a way to automatize counter narrative writing. However, the existing resources necessary to train NLG models are limited to 2-turn interactions (a hate speech and a counter narrative as response), while in real life, interactions can consist of multiple turns. In this paper, we present a hybrid approach for dialogical data collection, which combines the intervention of human expert annotators over machine generated dialogues obtained using 19 different configurations. The result of this work is DIALOCONAN, the first dataset comprising over 3000 fictitious multi-turn dialogues between a hater and an NGO operator, covering 6 targets of hate.


Multilingual Counter Narrative Type Classification
Yi-Ling Chung | Marco Guerini | Rodrigo Agerri
Proceedings of the 8th Workshop on Argument Mining

The growing interest in employing counter narratives for hatred intervention brings with it a focus on dataset creation and automation strategies. In this scenario, learning to recognize counter narrative types from natural text is expected to be useful for applications such as hate speech countering, where operators from non-governmental organizations are supposed to answer to hate with several and diverse arguments that can be mined from online sources. This paper presents the first multilingual work on counter narrative type classification, evaluating SoTA pre-trained language models in monolingual, multilingual and cross-lingual settings. When considering a fine-grained annotation of counter narrative classes, we report strong baseline classification results for the majority of the counter narrative types, especially if we translate every language to English before cross-lingual prediction. This suggests that knowledge about counter narratives can be successfully transferred across languages.

Human-in-the-Loop for Data Collection: a Multi-Target Counter Narrative Dataset to Fight Online Hate Speech
Margherita Fanton | Helena Bonaldi | Serra Sinem Tekiroğlu | Marco Guerini
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Undermining the impact of hateful content with informed and non-aggressive responses, called counter narratives, has emerged as a possible solution for having healthier online communities. Thus, some NLP studies have started addressing the task of counter narrative generation. Although such studies have made an effort to build hate speech / counter narrative (HS/CN) datasets for neural generation, they fall short of reaching high quality and/or high quantity. In this paper, we propose a novel human-in-the-loop data collection methodology in which a generative language model is refined iteratively by using its own data from the previous loops to generate new training samples that experts review and/or post-edit. Our experiments comprised several loops including diverse dynamic variations. Results show that the methodology is scalable and facilitates diverse, novel, and cost-effective data collection. To our knowledge, the resulting dataset is the only expert-based multi-target HS/CN dataset available to the community.

Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech
Yi-Ling Chung | Serra Sinem Tekiroğlu | Marco Guerini
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators’ Disagreement
Elisa Leonardelli | Stefano Menini | Alessio Palmero Aprosio | Marco Guerini | Sara Tonelli
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving scenario of social media. While several approaches have been proposed to tackle the problem from an algorithmic perspective, so as to reduce the need for annotated data, less attention has been paid to the quality of these data. Following a trend that has emerged recently, we focus on the level of agreement among annotators while selecting data to create offensive language datasets, a task involving a high level of subjectivity. Our study comprises the creation of three novel datasets of English tweets covering different topics and having five crowd-sourced judgments each. We also present an extensive set of experiments showing that selecting training and test data according to different levels of annotators’ agreement has a strong effect on classifiers’ performance and robustness. Our findings are further validated in cross-domain experiments and studied using a popular benchmark dataset. We show that such hard cases, where low agreement is present, are not necessarily due to poor-quality annotation and we advocate for a higher presence of ambiguous cases in future datasets, in order to train more robust systems and better account for the different points of view expressed online.
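The idea of selecting data by agreement level can be sketched as follows, assuming binary labels and an odd number of judgments per item (so the majority is never tied). This is a minimal illustration, not the authors' procedure; the function names and the toy items are hypothetical.

```python
from collections import Counter


def agreement_level(judgments):
    """Return (majority_label, votes_for_majority) for one item's judgments."""
    label, votes = Counter(judgments).most_common(1)[0]
    return label, votes


def split_by_agreement(items):
    """Group (text, majority_label) pairs by how many annotators agreed."""
    buckets = {}
    for text, judgments in items:
        label, votes = agreement_level(judgments)
        buckets.setdefault(votes, []).append((text, label))
    return buckets


if __name__ == "__main__":
    # Made-up items: 1 = offensive, 0 = not offensive, five judgments each.
    toy_items = [
        ("tweet a", [1, 1, 1, 1, 1]),  # full agreement
        ("tweet b", [1, 0, 1, 0, 1]),  # low agreement (3 of 5)
        ("tweet c", [0, 0, 0, 1, 0]),  # mild agreement (4 of 5)
    ]
    for votes, group in sorted(split_by_agreement(toy_items).items()):
        print(votes, group)
```

With five binary judgments the buckets are keyed 3, 4, or 5, so training and test sets can be drawn from one agreement level or a mix of levels.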


Regrexit or not Regrexit: Aspect-based Sentiment Analysis in Polarized Contexts
Vorakit Vorakitphan | Marco Guerini | Elena Cabrio | Serena Villata
Proceedings of the 28th International Conference on Computational Linguistics

Emotion analysis in polarized contexts represents a challenge for Natural Language Processing modeling. As a step in the aforementioned direction, we present a methodology to extend the task of Aspect-based Sentiment Analysis (ABSA) toward the affect and emotion representation in polarized settings. In particular, we adopt the three-dimensional model of affect based on Valence, Arousal, and Dominance (VAD). We then present a Brexit scenario that proves how affect varies toward the same aspect when politically polarized stances are presented. Our approach captures aspect-based polarization from newspapers regarding the Brexit scenario of 1.2m entities at sentence-level. We demonstrate how basic constituents of emotions can be mapped to the VAD model, along with their interactions respecting the polarized context in ABSA settings using biased key-concepts (e.g., “stop Brexit” vs. “support Brexit”). Quite intriguingly, the framework manages to produce coherent aspect evidence of Brexit’s stance from key-concepts, showing that VAD influences the support and opposition aspects.

Generating Counter Narratives against Online Hate Speech: Data and Strategies
Serra Sinem Tekiroğlu | Yi-Ling Chung | Marco Guerini
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recently, research has started focusing on avoiding undesired effects that come with content moderation, such as censorship and overblocking, when dealing with hatred online. The core idea is to directly intervene in the discussion with textual responses that are meant to counter the hate content and prevent it from further spreading. Accordingly, automation strategies, such as natural language generation, are beginning to be investigated. Still, they suffer from the lack of a sufficient amount of quality data and tend to produce generic/repetitive responses. Being aware of the aforementioned limitations, we present a study on how to collect responses to hate effectively, employing large scale unsupervised language models such as GPT-2 for the generation of silver data, and the best annotation strategies/neural architectures that can be used for data filtering before expert validation/post-editing.

Toward Stance-based Personas for Opinionated Dialogues
Thomas Scialom | Serra Sinem Tekiroğlu | Jacopo Staiano | Marco Guerini
Findings of the Association for Computational Linguistics: EMNLP 2020

In the context of chit-chat dialogues it has been shown that endowing systems with a persona profile is important to produce more coherent and meaningful conversations. Still, the representation of such personas has thus far been limited to a fact-based representation (e.g. “I have two cats.”). We argue that these representations remain superficial w.r.t. the complexity of human personality. In this work, we propose to make a step forward and investigate stance-based persona, trying to grasp more profound characteristics, such as opinions, values, and beliefs to drive language generation. To this end, we introduce a novel dataset that allows exploring different stance-based persona representations and their impact on claim generation, showing that they are able to grasp abstract and profound aspects of the author persona.


CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech
Yi-Ling Chung | Elizaveta Kuzmenko | Serra Sinem Tekiroglu | Marco Guerini
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Although there is an unprecedented effort to provide adequate responses in terms of laws and policies to hate content on social media platforms, dealing with hatred online is still a tough problem. Tackling hate speech in the standard way of content deletion or user suspension may be charged with censorship and overblocking. One alternate strategy, that has received little attention so far by the research community, is to actually oppose hate content with counter-narratives (i.e. informed textual responses). In this paper, we describe the creation of the first large-scale, multilingual, expert-based dataset of hate-speech/counter-narrative pairs. This dataset has been built with the effort of more than 100 operators from three different NGOs that applied their training and expertise to the task. Together with the collected data we also provide additional annotations about expert demographics, hate and response type, and data augmentation through translation and paraphrasing. Finally, we provide initial experiments to assess the quality of our data.

FASTDial: Abstracting Dialogue Policies for Fast Development of Task Oriented Agents
Serra Sinem Tekiroglu | Bernardo Magnini | Marco Guerini
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present a novel abstraction framework called FASTDial for designing task-oriented dialogue agents, built on top of the OpenDial toolkit. This framework is meant to facilitate prototyping and development of dialogue systems from scratch, even by non-tech-savvy users, especially when limited training data is available. To this end, we use a generic and simple frame-slots data-structure with pre-defined dialogue policies that allows for fast design and implementation at the price of some flexibility reduction. Moreover, it allows for minimizing programming effort and domain expert training time, by hiding away many implementation details. We provide a system demonstration screencast video in the following link:

How to Use Gazetteers for Entity Recognition with Neural Models
Simone Magnolini | Valerio Piccioni | Vevake Balaraman | Marco Guerini | Bernardo Magnini
Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5)

Generating Challenge Datasets for Task-Oriented Conversational Agents through Self-Play
Sourabh Majumdar | Serra Sinem Tekiroglu | Marco Guerini
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

End-to-end neural approaches are becoming increasingly common in conversational scenarios due to their promising performance when provided with a sufficient amount of data. In this paper, we present a novel methodology to address the interpretability of neural approaches in such scenarios by creating challenge datasets using dialogue self-play over multiple tasks/intents. Dialogue self-play allows generating large amounts of synthetic data; by taking advantage of the complete control over the generation process, we show how neural approaches can be evaluated in terms of unseen dialogue patterns. We propose several out-of-pattern test cases, each of which introduces a natural and unexpected user utterance phenomenon. As a proof of concept, we built a single and a multiple memory network, and show that these two architectures have diverse performances depending on the peculiar dialogue patterns.


Toward zero-shot Entity Recognition in Task-oriented Conversational Agents
Marco Guerini | Simone Magnolini | Vevake Balaraman | Bernardo Magnini
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

We present a domain portable zero-shot learning approach for entity recognition in task-oriented conversational agents, which does not assume any annotated sentences at training time. Rather, we derive a neural model of the entity names based only on available gazetteers, and then apply the model to recognize new entities in the context of user utterances. In order to evaluate our working hypothesis we focus on nominal entities that are largely used in e-commerce to name products. Through a set of experiments in two languages (English and Italian) and three different domains (furniture, food, clothing), we show that the neural gazetteer-based approach outperforms several competitive baselines, with minimal requirements of linguistic features.

A Methodology for Evaluating Interaction Strategies of Task-Oriented Conversational Agents
Marco Guerini | Sara Falcone | Bernardo Magnini
Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI

In task-oriented conversational agents, more attention has usually been devoted to assessing task effectiveness, rather than to how the task is achieved. However, conversational agents are moving towards more complex and human-like interaction capabilities (e.g. the ability to use a formal/informal register, to show an empathetic behavior), for which standard evaluation methodologies may not suffice. In this paper, we provide a novel methodology to assess - in a completely controlled way - the impact of an agent’s interaction strategies on the quality of experience. The methodology is based on a within-subject design, where two slightly different transcripts of the same interaction with a conversational agent are presented to the user. Through a series of pilot experiments we prove that this methodology allows fast and cheap experimentation/evaluation, focusing on aspects that are overlooked by current methods.

Generating E-Commerce Product Titles and Predicting their Quality
José G. Camargo de Souza | Michael Kozielski | Prashant Mathur | Ernie Chang | Marco Guerini | Matteo Negri | Marco Turchi | Evgeny Matusov
Proceedings of the 11th International Conference on Natural Language Generation

E-commerce platforms present products using titles that summarize product information. These titles cannot be created by hand, therefore an algorithmic solution is required. The task of automatically generating these titles given noisy user provided titles is one way to achieve the goal. The setting requires the generation process to be fast and the generated title to be both human-readable and concise. Furthermore, we need to understand if such generated titles are usable. As such, we propose approaches that (i) automatically generate product titles, (ii) predict their quality. Our approach scales to millions of products and both automatic and human evaluations performed on real-world data indicate our approaches are effective and applicable to existing e-commerce scenarios.


Fortia-FBK at SemEval-2017 Task 5: Bullish or Bearish? Inferring Sentiment towards Brands from Financial News Headlines
Youness Mansar | Lorenzo Gatti | Sira Ferradans | Marco Guerini | Jacopo Staiano
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we describe a methodology to infer Bullish or Bearish sentiment towards companies/brands. More specifically, our approach leverages affective lexica and word embeddings in combination with convolutional neural networks to infer the sentiment of financial news headlines towards a target company. Such architecture was used and evaluated in the context of the SemEval 2017 challenge (task 5, subtask 2), in which it obtained the best performance.


Echoes of Persuasion: The Effect of Euphony in Persuasive Communication
Marco Guerini | Gözde Özbal | Carlo Strapparava
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies


Depeche Mood: a Lexicon for Emotion Analysis from Crowd Annotated News
Jacopo Staiano | Marco Guerini
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Creative language explorations through a high-expressivity N-grams query language
Carlo Strapparava | Lorenzo Gatti | Marco Guerini | Oliviero Stock
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In computational linguistics a combination of syntagmatic and paradigmatic features is often exploited. While the first aspects are typically managed by information present in large n-gram databases, domain and ontological aspects are more properly modeled by lexical ontologies such as WordNet and semantic similarity spaces. This interconnection is even stricter when we are dealing with creative language phenomena, such as metaphors, prototypical properties, pun generation, hyperbolae and other rhetorical phenomena. This paper describes a way to focus on and accomplish some of these tasks by exploiting NgramQuery, a generalized query language on the Google N-gram database. The expressiveness of this query language is boosted by plugging semantic similarity acquired both from corpora (e.g. LSA) and from WordNet, also integrating operators for phonetics and sentiment analysis. The paper reports a number of examples of usage in some creative language tasks.


Sentiment Analysis: How to Derive Prior Polarities from SentiWordNet
Marco Guerini | Lorenzo Gatti | Marco Turchi
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

FBK: Sentiment Analysis in Twitter with Tweetsted
Md. Faisal Mahbub Chowdhury | Marco Guerini | Sara Tonelli | Alberto Lavelli
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)


Ecological Evaluation of Persuasive Messages Using Google AdWords
Marco Guerini | Carlo Strapparava | Oliviero Stock
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Assessing Sentiment Strength in Words Prior Polarities
Lorenzo Gatti | Marco Guerini
Proceedings of COLING 2012: Posters

Brand Pitt: A Corpus to Explore the Art of Naming
Gözde Özbal | Carlo Strapparava | Marco Guerini
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The name of a company or a brand is the key element to a successful business. A good name is able to state the area of competition and communicate the promise given to customers by evoking semantic associations. Although various resources provide distinct tips for inventing creative names, little research was carried out to investigate the linguistic aspects behind the naming mechanism. Besides, there might be latent methods that copywriters unconsciously use. In this paper, we describe the annotation task that we have conducted on a dataset of creative names collected from various resources to create a gold standard for linguistic creativity in naming. Based on the annotations, we compile common and latent methods of naming and explore the correlations among linguistic devices, provoked effects and business domains. This resource represents a starting point for a corpus based approach to explore the art of naming.


Evaluation Metrics for Persuasive NLP with Google AdWords
Marco Guerini | Carlo Strapparava | Oliviero Stock
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Evaluating systems and theories about persuasion represents a bottleneck for both theoretical and applied fields: experiments are usually expensive and time consuming. Still, measuring the persuasive impact of a message is of paramount importance. In this paper we present a new “cheap and fast” methodology for measuring the persuasiveness of communication. This methodology allows conducting experiments with thousands of subjects for a few dollars in a few hours, by tweaking and using existing commercial tools for advertising on the web, such as Google AdWords. The central idea is to use AdWords features for defining message persuasiveness metrics. Along with a description of our approach we provide some pilot experiments, conducted both with text and image based ads, that confirm the effectiveness of our ideas. We also discuss the possible application of research on persuasive systems to Google AdWords in order to add more flexibility in the wearing out of persuasive messages.

Predicting Persuasiveness in Political Discourses
Carlo Strapparava | Marco Guerini | Oliviero Stock
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In political speeches, the audience tends to react or resonate to signals of persuasive communication, including an expected theme, a name or an expression. Automatically predicting the impact of such discourses is a challenging task. In fact nowadays, with the huge amount of textual material that flows on the Web (news, discourses, blogs, etc.), it can be useful to have a measure for testing the persuasiveness of what we retrieve or possibly of what we want to publish on the Web. In this paper we exploit a corpus of political discourses collected from various Web sources, tagged with audience reactions, such as applause, as indicators of persuasive expressions. In particular, we use this data set in a machine learning framework to explore the possibility of classifying the transcript of political discourses, according to their persuasive power, predicting the sentences that possibly trigger applause. We also explore differences between Democratic and Republican speeches, and experiment with the resulting classifiers in grading some of the discourses in the Obama-McCain presidential campaign available on the Web.


Resources for Persuasion
Marco Guerini | Carlo Strapparava | Oliviero Stock
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents resources and strategies for persuasive natural language processing. After the introduction of a specifically tagged corpus, some techniques for affective language processing and for persuasive lexicon extraction are provided together with prospective scenarios of application.

Valentino: A Tool for Valence Shifting of Natural Language Texts
Marco Guerini | Carlo Strapparava | Oliviero Stock
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper a first implementation of a tool for valence shifting of natural language texts, named Valentino (VALENced Text INOculator), is presented. Valentino can modify existing textual expressions towards more positively or negatively valenced versions. To this end we built specific resources gathering various valenced terms that are semantically or contextually connected, and implemented strategies that use these resources for substituting input terms.