Alejandro Jaimes


Multi-View Source Ablation for Faithful Summarization
Shuyang Cao | Liang Ma | Di Lu | Robert L Logan Iv | Joel Tetreault | Alejandro Jaimes
Findings of the Association for Computational Linguistics: EACL 2023

In this paper, we present MuFaSSa (Multi-view Faithfulness Scoring via Source Ablation), a metric for evaluating faithfulness of abstractive summaries, and for guiding training of more faithful summarizers. For evaluation, MuFaSSa employs different strategies (e.g., masking entity mentions) to first remove information from the source document to form multiple ablated views. Then, the faithfulness level of each token in a generated summary is measured by the difference between the token generation probabilities when given the original document and the ablated document as inputs to trained summarizers. For training, MuFaSSa uses a novel word truncation objective that drops unfaithful tokens located by MuFaSSa in both the decoder input and output. Alignments with human-annotated faithfulness labels on AggreFact show that MuFaSSa is comparable to or better than existing metrics built on classifiers or QA models pre-trained on other tasks. In experiments on summarization with XSum and CNN/DailyMail, models trained with word truncation using MuFaSSa outperform competitive methods according to both automatic faithfulness metrics and human assessments.


XLTime: A Cross-Lingual Knowledge Transfer Framework for Temporal Expression Extraction
Yuwei Cao | William Groves | Tanay Kumar Saha | Joel Tetreault | Alejandro Jaimes | Hao Peng | Philip Yu
Findings of the Association for Computational Linguistics: NAACL 2022

Temporal Expression Extraction (TEE) is essential for understanding time in natural language. It has applications in Natural Language Processing (NLP) tasks such as question answering, information retrieval, and causal inference. To date, work in this area has mostly focused on English as there is a scarcity of labeled data for other languages. We propose XLTime, a novel framework for multilingual TEE. XLTime works on top of pre-trained language models and leverages multi-task learning to prompt cross-language knowledge transfer both from English and within the non-English languages. XLTime alleviates problems caused by a shortage of data in the target language. We apply XLTime with different language models and show that it outperforms the previous automatic SOTA methods on French, Spanish, Portuguese, and Basque, by large margins. XLTime also closes the gap considerably on the handcrafted HeidelTime method.

CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization
Hossein Rajaby Faghihi | Bashar Alhafni | Ke Zhang | Shihao Ran | Joel Tetreault | Alejandro Jaimes
Findings of the Association for Computational Linguistics: EMNLP 2022

Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks to leverage large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches for those tasks. This paper presents , the largest dataset of local crisis event timelines available to date. contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines compared to the human performance on both tasks.Our dataset, code, and models are publicly available (


GTN-ED: Event Detection Using Graph Transformer Networks
Sanghamitra Dutta | Liang Ma | Tanay Kumar Saha | Di Liu | Joel Tetreault | Alejandro Jaimes
Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15)

Recent works show that the graph structure of sentences, generated from dependency parsers, has potential for improving event detection. However, they often only leverage the edges (dependencies) between words, and discard the dependency labels (e.g., nominal-subject), treating the underlying graph edges as homogeneous. In this work, we propose a novel framework for incorporating both dependencies and their labels using a recently proposed technique called Graph Transformer Network (GTN). We integrate GTN to leverage dependency relations on two existing homogeneous-graph-based models and demonstrate an improvement in the F1 score on the ACE dataset.

A Novel Framework for Detecting Important Subevents from Crisis Events via Dynamic Semantic Graphs
Evangelia Spiliopoulou | Tanay Kumar Saha | Joel Tetreault | Alejandro Jaimes
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)

Social media is an essential tool to share information about crisis events, such as natural disasters. Event Detection aims at extracting information in the form of an event, but considers each event in isolation, without combining information across sentences or events. Many posts in Crisis NLP contain repetitive or complementary information which needs to be aggregated (e.g., the number of trapped people and their location) for disaster response. Although previous approaches in Crisis NLP aggregate information across posts, they only use shallow representations of the content (e.g., keywords), which cannot adequately represent the semantics of a crisis event and its sub-events. In this work, we propose a novel framework to extract critical sub-events from a large-scale crisis event by combining important information across relevant tweets. Our framework first converts all the tweets from a crisis event into a temporally-ordered set of graphs. Then it extracts sub-graphs that represent semantic relationships connecting verbs and nouns in 3 to 6 node sub-graphs. It does this by learning edge weights via Dynamic Graph Convolutional Networks (DGCNs) and extracting smaller, relevant sub-graphs. Our experiments show that our extracted structures (1) are semantically meaningful sub-events and (2) contain information important for the large crisis-event. Furthermore, we show that our approach significantly outperforms event detection baselines, highlighting the importance of aggregating information across tweets for our task.

Journalistic Guidelines Aware News Image Captioning
Xuewen Yang | Svebor Karaman | Joel Tetreault | Alejandro Jaimes
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

The task of news article image captioning aims to generate descriptive and informative captions for news article images. Unlike conventional image captions that simply describe the content of the image in general terms, news image captions follow journalistic guidelines and rely heavily on named entities to describe the image content, often drawing context from the whole article they are associated with. In this work, we propose a new approach to this task, motivated by caption guidelines that journalists follow. Our approach, Journalistic Guidelines Aware News Image Captioning (JoGANIC), leverages the structure of captions to improve the generation quality and guide our representation design. Experimental results, including detailed ablation studies, on two large-scale publicly available datasets show that JoGANIC substantially outperforms state-of-the-art methods both on caption generation and named entity related metrics.


pdf bib
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events
Claire Bonial | Tommaso Caselli | Snigdha Chaturvedi | Elizabeth Clark | Ruihong Huang | Mohit Iyyer | Alejandro Jaimes | Heng Ji | Lara J. Martin | Ben Miller | Teruko Mitamura | Nanyun Peng | Joel Tetreault
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events


Humor in Collective Discourse: Unsupervised Funniness Detection in the New Yorker Cartoon Caption Contest
Dragomir Radev | Amanda Stent | Joel Tetreault | Aasish Pappu | Aikaterini Iliakopoulou | Agustin Chanfreau | Paloma de Juan | Jordi Vallmitjana | Alejandro Jaimes | Rahul Jha | Robert Mankoff
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The New Yorker publishes a weekly captionless cartoon. More than 5,000 readers submit captions for it. The editors select three of them and ask the readers to pick the funniest one. We describe an experiment that compares a dozen automatic methods for selecting the funniest caption. We show that negative sentiment, human-centeredness, and lexical centrality most strongly match the funniest captions, followed by positive sentiment. These results are useful for understanding humor and also in the design of more engaging conversational agents in text and multimodal (vision+text) systems. As part of this work, a large set of cartoons and captions is being made available to the community.