Shabnam Tafreshi

2025

The Sixth Workshop on Insights from Negative Results in NLP
Aleksandr Drozd | João Sedoc | Shabnam Tafreshi | Arjun Akula | Raphael Shu
The Sixth Workshop on Insights from Negative Results in NLP

2024

LLM-Based Section Identifiers Excel on Open Source but Stumble in Real World Applications
Saranya Krishnamoorthy | Ayush Singh | Shabnam Tafreshi
Proceedings of the 6th Clinical Natural Language Processing Workshop

Electronic health records (EHR) even though a boon for healthcare practitioners, are growing convoluted and longer every day. Sifting around these lengthy EHRs is taxing and becomes a cumbersome part of physician-patient interaction. Several approaches have been proposed to help alleviate this prevalent issue either via summarization or sectioning, however, only a few approaches have truly been helpful in the past. With the rise of automated methods, machine learning (ML) has shown promise in solving the task of identifying relevant sections in EHR. However, most ML methods rely on labeled data which is difficult to get in healthcare. Large language models (LLMs) on the other hand, have performed impressive feats in natural language processing (NLP), that too in a zero-shot manner, i.e. without any labeled data. To that end, we propose using LLMs to identify relevant section headers. We find that GPT-4 can effectively solve the task on both zero and few-shot settings as well as segment dramatically better than state-of-the-art methods. Additionally, we also annotate a much harder real world dataset and find that GPT-4 struggles to perform well, alluding to further research and harder benchmarks.

Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Orphée De Clercq | Valentin Barriere | Jeremy Barnes | Roman Klinger | João Sedoc | Shabnam Tafreshi
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Findings of WASSA 2024 Shared Task on Empathy and Personality Detection in Interactions
Salvatore Giorgi | João Sedoc | Valentin Barriere | Shabnam Tafreshi
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

This paper presents the results of the WASSA 2024 shared task on predicting empathy, emotion, and personality in conversations and reactions to news articles. Participating teams were given access to a new, unpublished extension of the WASSA 2023 shared task dataset. This task is both multi-level and multi-modal: data is available at the person, essay, dialog, and dialog-turn levels and includes formal (news articles) and informal text (essays and dialogs), self-report data (personality and distress), and third-party annotations (empathy and emotion). The shared task included a new focus on conversations between humans and LLM-based virtual agents which occur immediately after reading and reacting to the news articles. Participants were encouraged to explore the multi-level and multi-modal nature of this data. Participation was encouraged in four tracks: (i) predicting the perceived empathy at the dialog level, (ii) predicting turn-level empathy, emotion polarity, and emotion intensity in conversations, (iii) predicting state empathy and distress scores, and (iv) predicting personality. In total, 14 teams participated in the shared task. We summarize the methods and resources used by the participating teams.

2023

This paper provides an overview of the first shared task on choosing beneficial instances for machine translation, conducted as part of the CoCo4MT 2023 Workshop at MTSummit. This shared task was motivated by the need to make the data annotation process for machine translation more efficient, particularly for low-resource languages for which collecting human translations may be difficult or expensive. The task involved developing methods for selecting the most beneficial instances for training a machine translation system without access to an existing parallel dataset in the target language, such that the best selected instances can then be manually translated. Two teams participated in the shared task, namely the Williams team and the AST team. Submissions were evaluated by training a machine translation model on each submission’s chosen instances and comparing the resulting models’ performance using the chrF++ score. The system that ranked first is by the Williams team, which finds representative instances by clustering the training data.

Findings of WASSA 2023 Shared Task on Empathy, Emotion and Personality Detection in Conversation and Reactions to News Articles
Valentin Barriere | João Sedoc | Shabnam Tafreshi | Salvatore Giorgi
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

This paper presents the results of the WASSA 2023 shared task on predicting empathy, emotion, and personality in conversations and reactions to news articles. Participating teams were given access to a new dataset from Omitaomu et al. (2022) comprising empathic and emotional reactions to news articles. The dataset included formal and informal text, self-report data, and third-party annotations. Specifically, the dataset contained news articles (where harm is done to a person, group, or other) and crowd-sourced essays written in reaction to the article. After reacting via essays, crowd workers engaged in conversations about the news articles. Finally, the crowd workers self-reported their empathic concern and distress, personality (using the Big Five), and multi-dimensional empathy (via the Interpersonal Reactivity Index). A third-party annotated both the conversational turns (for empathy, emotion polarity, and emotion intensity) and essays (for multi-label emotions). Thus, the dataset contained outcomes (self-reported or third-party annotated) at the turn level (within conversations) and the essay level. Participation was encouraged in five tracks: (i) predicting turn-level empathy, emotion polarity, and emotion intensity in conversations, (ii) predicting state empathy and distress scores, (iii) predicting emotion categories, (iv) predicting personality, and (v) predicting multi-dimensional trait empathy. In total, 21 teams participated in the shared task. We summarize the methods and resources used by the participating teams.

2022

You’ve translated it, now what?
Michael Maxwell | Shabnam Tafreshi | Aquia Richburg | Balaji Kodali | Kymani Brown
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)

Humans use document formatting to discover document and section titles, and important phrases. But when machines process a paper–especially documents OCRed from images–these cues are often invisible to downstream processes: words in footnotes or body text are treated as just as important as words in titles. It would be better for indexing and summarization tools to be guided by implicit document structure. In an ODNI-sponsored project, ARLIS looked at discovering formatting in OCRed text as a way to infer document structure. Most OCR engines output results as hOCR (an XML format), giving bounding boxes around characters. In theory, this also provides style information such as bolding and italicization, but in practice, this capability is limited. For example, the Tesseract OCR tool provides bounding boxes, but does not attempt to detect bold text (relevant to author emphasis and specialized fields in e.g. print dictionaries), and its discrimination of italicization is poor. Our project inferred font size from hOCR bounding boxes, and using that and other cues (e.g. the fact that titles tend to be short) determined which text constituted section titles; from this, a document outline can be created. We also experimented with algorithms for detecting bold text. Our best algorithm has a much improved recall and precision, although the exact numbers are font-dependent. The next step is to incorporate inferred structure into the output of machine translation. One way is to embed XML tags for inferred structure into the text extracted from the imaged document, and to either pass the strings enclosed by XML tags to the MT engine individually, or pass the tags through the MT engine without modification. This structural information can guide downstream bulk processing tasks such as summarization and search, and also enables building tables of contents for human users examining individual documents.

Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Workshop 2: Corpus Generation and Corpus Augmentation for Machine Translation)
John E. Ortega | Marine Carpuat | William Chen | Katharina Kann | Constantine Lignos | Maja Popovic | Shabnam Tafreshi
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Workshop 2: Corpus Generation and Corpus Augmentation for Machine Translation)

Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis
Jeremy Barnes | Orphée De Clercq | Valentin Barriere | Shabnam Tafreshi | Sawsan Alqahtani | João Sedoc | Roman Klinger | Alexandra Balahur
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

WASSA 2022 Shared Task: Predicting Empathy, Emotion and Personality in Reaction to News Stories
Valentin Barriere | Shabnam Tafreshi | João Sedoc | Sawsan Alqahtani
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

This paper presents the results that were obtained from the WASSA 2022 shared task on predicting empathy, emotion, and personality in reaction to news stories. Participants were given access to a dataset comprising empathic reactions to news stories where harm is done to a person, group, or other. These reactions consist of essays and Batson’s empathic concern and personal distress scores. The dataset was further extended in the WASSA 2021 shared task to include news articles, person-level demographic information (e.g. age, gender), personality information, and Ekman’s six basic emotions at essay level. Participation was encouraged in four tracks: predicting empathy and distress scores, predicting emotion categories, predicting personality, and predicting interpersonal reactivity. In total, 14 teams participated in the shared task. We summarize the methods and resources used by the participating teams.

2021

Proceedings of the Second Workshop on Insights from Negative Results in NLP
João Sedoc | Anna Rogers | Anna Rumshisky | Shabnam Tafreshi
Proceedings of the Second Workshop on Insights from Negative Results in NLP

Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Orphee De Clercq | Alexandra Balahur | Joao Sedoc | Valentin Barriere | Shabnam Tafreshi | Sven Buechel | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

WASSA 2021 Shared Task: Predicting Empathy and Emotion in Reaction to News Stories
Shabnam Tafreshi | Orphee De Clercq | Valentin Barriere | Sven Buechel | João Sedoc | Alexandra Balahur
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

This paper presents the results that were obtained from the WASSA 2021 shared task on predicting empathy and emotions. The participants were given access to a dataset comprising empathic reactions to news stories where harm is done to a person, group, or other. These reactions consist of essays, Batson empathic concern, and personal distress scores, and the dataset was further extended with news articles, person-level demographic information (age, gender, ethnicity, income, education level), and personality information. Additionally, emotion labels, namely Ekman’s six basic emotions, were added to the essays at both the document and sentence level. Participation was encouraged in two tracks: predicting empathy and predicting emotion categories. In total, five teams participated in the shared task. We summarize the methods and resources used by the participating teams.

2019

GWU NLP Lab at SemEval-2019 Task 3 : EmoContext: Effectiveness of Contextual Information in Models for Emotion Detection in Sentence-level at Multi-genre Corpus
Shabnam Tafreshi | Mona Diab
Proceedings of the 13th International Workshop on Semantic Evaluation

In this paper we present an emotion classifier model that was submitted to the SemEval-2019 Task 3 : EmoContext. Our approach is a Gated Recurrent Neural Network (GRU) model with an attention layer, bootstrapped with contextual information and trained on a multigenre corpus that combines several popular emotion data sets. We utilize different word embeddings to empirically select the embedding best suited to represent our features. Our aim is to build a robust emotion classifier that can generalize emotion detection, that is, one that learns emotion cues in a noisy training environment. To fulfill this aim we train our model with a multigenre emotion corpus; in this way we benefit from having more training data. We achieved an overall F1-score of 56.05% and placed 144th. Given our aim and the noisy training environment, the results are as anticipated.

2018

Emotion Detection and Classification in a Multigenre Corpus with Joint Multi-Task Deep Learning
Shabnam Tafreshi | Mona Diab
Proceedings of the 27th International Conference on Computational Linguistics

Detection and classification of emotion categories expressed by a sentence is a challenging task due to the subjectivity of emotion. To date, most models are trained and evaluated on a single genre, and when they are used to predict emotion in a different genre their performance drops by a large margin. To address the issue of robustness, we model the problem within a joint multi-task learning framework. We train this model with a multigenre emotion corpus to predict emotions across various genres. Each genre is represented as a separate task, and we use soft parameter shared layers across the various tasks. Our experimental results show that this model improves the results across the various genres, compared to single-genre training in the same neural net architecture.

Sentence and Clause Level Emotion Annotation, Detection, and Classification in a Multi-Genre Corpus
Shabnam Tafreshi | Mona Diab
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)