Ellen Riloff

Also published as: E. Riloff


2025

Distinguishing between rhetorical questions and informational questions is a challenging task, as many rhetorical questions have similar surface forms to informational questions. Existing datasets, however, do not contain many questions that can be rhetorical or informational in different contexts. We introduce Studying Rhetorically Ambiguous Questions (SRAQ), a new dataset explicitly constructed to support the study of such rhetorical ambiguity. The questions in SRAQ can be interpreted as either rhetorical or informational depending on the context. We evaluate the performance of state-of-the-art language models on this dataset and find that they struggle to recognize many rhetorical questions.

2024

During crisis situations, observations of other people’s behaviors often play an essential role in a person’s decision-making. For example, a person might evacuate before a hurricane only if everyone else in the neighborhood does so. Conversely, a person might stay if no one else is leaving. Such observations are called social cues. Social cues are important for understanding people’s response to crises, so recognizing them can help inform the decisions of government officials and emergency responders. In this paper, we propose the first NLP task to categorize social cues in social media posts during crisis situations. We introduce a manually annotated dataset of 6,000 tweets, labeled with respect to eight social cue categories. We also present experimental results of several classification models, which show that some types of social cues can be recognized reasonably well, but overall this task is challenging for NLP systems. We further present error analyses to identify specific types of mistakes and promising directions for future research on this task.
Humans frequently experience emotions. When emotions arise, they affect not only our mental state but can also change our physical state. For example, we often open our eyes wide when we are surprised, or clap our hands when we feel excited. Physical manifestations of emotions are referred to as embodied emotion in the psychology literature. From an NLP perspective, recognizing descriptions of physical movements or physiological responses associated with emotions is a type of implicit emotion recognition. Our work introduces a new task of recognizing expressions of embodied emotion in natural language. We create a dataset of sentences that contains 7,300 body part mentions with human annotations for embodied emotion. We develop a classification model for this task and present two methods to acquire weakly labeled instances of embodied emotion by extracting emotional manner expressions and by prompting a language model. Our experiments show that the weakly labeled data can train an effective classification model without gold data, and can also improve performance when combined with gold data. Our dataset is publicly available at https://github.com/yyzhuang1991/Embodied-Emotions.

2023

Prior research on affective event classification showed that exploiting weakly labeled data for training can improve model performance. In this work, we propose a simpler and more effective approach for generating training data by automatically acquiring and labeling affective events with Multiple View Co-prompting, which leverages two language model prompts that provide independent views of an event. The approach starts with a modest amount of gold data and prompts pre-trained language models to generate new events. Next, information about the probable affective polarity of each event is collected from two complementary language model prompts and jointly used to assign polarity labels. Experimental results on two datasets show that the newly acquired events improve a state-of-the-art affective event classifier. We also present analyses which show that using multiple views produces polarity labels of higher quality than either view on its own.
Situation recognition is the task of recognizing the activity depictedin an image, including the people and objects involved. Previousmodels for this task typically train a classifier to identify theactivity using a backbone image feature extractor. We propose thatcommonsense knowledge about the objects depicted in an image can alsobe a valuable source of information for activity identification. Previous NLP research has argued that knowledge about the prototypicalfunctions of physical objects is important for language understanding,and NLP techniques have been developed to acquire this knowledge. Our work investigates whether this prototypical function knowledgecan also be beneficial for visual situation recognition. Webuild a framework that incorporates this type of commonsense knowledgein a transformer-based model that is trained to predict the actionverb for situation recognition. Our experimental results show thatadding prototypical function knowledge about physical objects doesimprove performance for the visual activity recognition task.

2022

Commonsense knowledge about the typicalfunctions of physical objects allows people tomake inferences during sentence understanding.For example, we infer that “Sam enjoyedthe book” means that Sam enjoyed reading thebook, even though the action is implicit. Priorresearch has focused on learning the prototypicalfunctions of physical objects in order toenable inferences about implicit actions. Butmany sentences refer to objects even when theyare not used (e.g., “The book fell”). We arguethat NLP systems need to recognize whether anobject is being used before inferring how theobject is used. We define a new task called ObjectUse Classification that determines whethera physical object mentioned in a sentence wasused or likely will be used. We introduce a newdataset for this task and present a classificationmodel that exploits data augmentation methodsand FrameNet when fine-tuning a pre-trainedlanguage model. We also show that object useclassification combined with knowledge aboutthe prototypical functions of objects has thepotential to yield very good inferences aboutimplicit and anticipated actions.
Relation extraction models typically cast the problem of determining whether there is a relation between a pair of entities as a single decision. However, these models can struggle with long or complex language constructions in which two entities are not directly linked, as is often the case in scientific publications. We propose a novel approach that decomposes a binary relation into two unary relations that capture each argument’s role in the relation separately. We create a stacked learning model that incorporates information from unary and binary relation extractors to determine whether a relation holds between two entities. We present experimental results showing that this approach outperforms several competitive relation extractors on a new corpus of planetary science publications as well as a benchmark dataset in the biology domain.

2021

Humans create things for a reason. Ancient people created spears for hunting, knives for cutting meat, pots for preparing food, etc. The prototypical function of a physical artifact is a kind of commonsense knowledge that we rely on to understand natural language. For example, if someone says “She borrowed the book” then you would assume that she intends to read the book, or if someone asks “Can I use your knife?” then you would assume that they need to cut something. In this paper, we introduce a new NLP task of learning the prototypical uses for human-made physical objects. We use frames from FrameNet to represent a set of common functions for objects, and describe a manually annotated data set of physical objects labeled with their prototypical function. We also present experimental results for this task, including BERT-based models that use predictions from masked patterns as well as artifact sense definitions from WordNet and frame definitions from FrameNet.
Frame identification is one of the key challenges for frame-semantic parsing. The goal of this task is to determine which frame best captures the meaning of a target word or phrase in a sentence. We present a new model for frame identification that uses a pre-trained transformer model to generate representations for frames and lexical units (senses) using their formal definitions in FrameNet. Our frame identification model assesses the suitability of a frame for a target word in a sentence based on the semantic coherence of their meanings. We evaluate our model on three data sets and show that it consistently achieves better performance than previous systems.

2020

Social media posts often contain questions, but many of the questions are rhetorical and do not seek information. Our work studies the problem of distinguishing rhetorical and information-seeking questions on Twitter. Most work has focused on features of the question itself, but we hypothesize that the prior context plays a role too. This paper introduces a new dataset containing questions in tweets paired with their prior tweets to provide context. We create classification models to assess the difficulty of distinguishing rhetorical and information-seeking questions, and experiment with different properties of the prior context. Our results show that the prior tweet and topic features can improve performance on this task.
Prior research has recognized the need to associate affective polarities with events and has produced several techniques and lexical resources for identifying affective events. Our research introduces new classification models to assign affective polarity to event phrases. First, we present a BERT-based model for affective event classification and show that the classifier achieves substantially better performance than a large affective event knowledge base. Second, we present a discourse-enhanced self-training method that iteratively improves the classifier with unlabeled data. The key idea is to exploit event phrases that occur with a coreferent sentiment expression. The discourse-enhanced self-training algorithm iteratively labels new event phrases based on both the classifier’s predictions and the polarities of the event’s coreferent sentiment expressions. Our results show that discourse-enhanced self-training further improves both recall and precision for affective event classification.
This paper presents the first research aimed at recognizing euphemistic and dysphemistic phrases with natural language processing. Euphemisms soften references to topics that are sensitive, disagreeable, or taboo. Conversely, dysphemisms refer to sensitive topics in a harsh or rude way. For example, “passed away” and “departed” are euphemisms for death, while “croaked” and “six feet under” are dysphemisms for death. Our work explores the use of sentiment analysis to recognize euphemistic and dysphemistic language. First, we identify near-synonym phrases for three topics (firing, lying, and stealing) using a bootstrapping algorithm for semantic lexicon induction. Next, we classify phrases as euphemistic, dysphemistic, or neutral using lexical sentiment cues and contextual sentiment analysis. We introduce a new gold standard data set and present our experimental results for this task.

2019

Human Needs categories have been used to characterize the reason why an affective event is positive or negative. For example, “I got the flu” and “I got fired” are both negative (undesirable) events, but getting the flu is a Health problem while getting fired is a Financial problem. Previous work created learning models to assign events to Human Needs categories based on their words and contexts. In this paper, we introduce an intermediate step that assigns words to relevant semantic concepts. We create lightly supervised models that learn to label words with respect to 10 semantic concepts associated with Human Needs categories, and incorporate these labels as features for event categorization. Our results show that recognizing relevant semantic concepts improves both the recall and precision of Human Needs categorization for events.

2018

We often talk about events that impact us positively or negatively. For example “I got a job” is good news, but “I lost my job” is bad news. When we discuss an event, we not only understand its affective polarity but also the reason why the event is beneficial or detrimental. For example, getting or losing a job has affective polarity primarily because it impacts us financially. Our work aims to categorize affective events based upon human need categories that often explain people’s motivations and desires: PHYSIOLOGICAL, HEALTH, LEISURE, SOCIAL, FINANCIAL, COGNITION, and FREEDOM. We create classification models based on event expressions as well as models that use contexts surrounding event mentions. We also design a co-training model that learns from unlabeled data by simultaneously training event expression and event context classifiers in an iterative learning process. Our results show that co-training performs well, producing substantially better results than the individual classifiers.
People go to different places to engage in activities that reflect their goals. For example, people go to restaurants to eat, libraries to study, and churches to pray. We refer to an activity that represents a common reason why people typically go to a location as a prototypical goal activity (goal-act). Our research aims to learn goal-acts for specific locations using a text corpus and semi-supervised learning. First, we extract activities and locations that co-occur in goal-oriented syntactic patterns. Next, we create an activity profile matrix and apply a semi-supervised label propagation algorithm to iteratively revise the activity strengths for different locations using a small set of labeled data. We show that this approach outperforms several baseline methods when judged against goal-acts identified by human annotators.
Many events have a positive or negative impact on our lives (e.g., “I bought a house” is typically good news, but ”My house burned down” is bad news). Recognizing events that have affective polarity is essential for narrative text understanding, conversational dialogue, and applications such as summarization and sarcasm detection. We will discuss our recent work on identifying affective events and categorizing them based on the underlying reasons for their affective polarity. First, we will describe a weakly supervised learning method to induce a large set of affective events from a text corpus by optimizing for semantic consistency. Second, we will present models to classify affective events based on Human Need Categories, which often explain people’s motivations and desires. Our best results use a co-training model that consists of event expression and event context classifiers and exploits both labeled and unlabeled texts. We will conclude with a discussion of interesting directions for future work in this area.

2017

Effective models of social dialog must understand a broad range of rhetorical and figurative devices. Rhetorical questions (RQs) are a type of figurative language whose aim is to achieve a pragmatic goal, such as structuring an argument, being persuasive, emphasizing a point, or being ironic. While there are computational models for other forms of figurative language, rhetorical questions have received little attention to date. We expand a small dataset from previous work, presenting a corpus of 10,270 RQs from debate forums and Twitter that represent different discourse functions. We show that we can clearly distinguish between RQs and sincere questions (0.76 F1). We then show that RQs can be used both sarcastically and non-sarcastically, observing that non-sarcastic (other) uses of RQs are frequently argumentative in forums, and persuasive in tweets. We present experiments to distinguish between these uses of RQs using SVM and LSTM models that represent linguistic features and post-level context, achieving results as high as 0.76 F1 for “sarcastic” and 0.77 F1 for “other” in forums, and 0.83 F1 for both “sarcastic” and “other” in tweets. We supplement our quantitative experiments with an in-depth characterization of the linguistic variation in RQs.

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

1995

1993

1992

1991

Search
Co-authors
Fix author