Pride Kavumba


2021

Learning to Learn to be Right for the Right Reasons
Pride Kavumba | Benjamin Heinzerling | Ana Brassard | Kentaro Inui
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Improving model generalization on held-out data is one of the core objectives in commonsense reasoning. Recent work has shown that models trained on the dataset with superficial cues tend to perform well on the easy test set with superficial cues but perform poorly on the hard test set without superficial cues. Previous approaches have resorted to manual methods of encouraging models not to overfit to superficial cues. While some of the methods have improved performance on hard instances, they also lead to degraded performance on easy instances. Here, we propose to explicitly learn a model that does well on both the easy test set with superficial cues and the hard test set without superficial cues. Using a meta-learning objective, we learn such a model that improves performance on both the easy test set and the hard test set. By evaluating our models on Choice of Plausible Alternatives (COPA) and Commonsense Explanation, we show that our proposed method leads to improved performance on both the easy test set and the hard test set upon which we observe up to 16.5 percentage points improvement over the baseline.

2019

When Choosing Plausible Alternatives, Clever Hans can be Clever
Pride Kavumba | Naoya Inoue | Benjamin Heinzerling | Keshav Singh | Paul Reisert | Kentaro Inui
Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing

Pretrained language models, such as BERT and RoBERTa, have shown large improvements in the commonsense reasoning benchmark COPA. However, recent work found that many improvements in benchmarks of natural language understanding are not due to models learning the task, but due to their increasing ability to exploit superficial cues, such as tokens that occur more often in the correct answer than the wrong one. Is BERT’s and RoBERTa’s good performance on COPA also caused by this? We find superficial cues in COPA, as well as evidence that BERT exploits these cues. To remedy this problem, we introduce Balanced COPA, an extension of COPA that does not suffer from easy-to-exploit single token cues. We analyze BERT’s and RoBERTa’s performance on original and Balanced COPA, finding that BERT relies on superficial cues when they are present, but still achieves comparable performance once they are made ineffective, suggesting that BERT learns the task to a certain degree when forced to. In contrast, RoBERTa does not appear to rely on superficial cues.
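
The abstract above describes superficial cues as tokens that occur more often in the correct answer than in the wrong one. The following is a minimal sketch of how such a skew could be counted; it is an illustration only, not the authors' analysis code. The function names, the `min_count` threshold, the whitespace tokenization, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def token_cue_counts(instances):
    """Count, per token, how many correct and wrong alternatives contain it.

    `instances` is a list of (alternatives, correct_index) pairs
    (a hypothetical format chosen for this sketch).
    """
    in_correct = Counter()
    in_wrong = Counter()
    for alternatives, correct_index in instances:
        for i, alternative in enumerate(alternatives):
            # Count each token at most once per alternative.
            tokens = set(alternative.lower().split())
            target = in_correct if i == correct_index else in_wrong
            target.update(tokens)
    return in_correct, in_wrong


def correct_ratio_per_token(instances, min_count=2):
    """Fraction of a token's occurrences that fall in correct alternatives.

    Tokens seen in fewer than `min_count` alternatives are skipped.
    A ratio far from 0.5 hints at a possible single-token cue.
    """
    in_correct, in_wrong = token_cue_counts(instances)
    ratios = {}
    for token in set(in_correct) | set(in_wrong):
        total = in_correct[token] + in_wrong[token]
        if total >= min_count:
            ratios[token] = in_correct[token] / total
    return ratios


# Toy data, invented for illustration (not taken from COPA).
toy = [
    (["a man went home", "the dog barked"], 0),
    (["a woman smiled", "the cat slept"], 0),
    (["the sun rose", "a bird sang"], 1),
]
ratios = correct_ratio_per_token(toy)
print(ratios)
```

In this toy set, "a" appears only in correct alternatives and "the" only in wrong ones, so a model could score well by attending to those tokens alone. A balanced dataset in the sense of the abstract would make such single-token statistics uninformative.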

Improving Evidence Detection by Leveraging Warrants
Keshav Singh | Paul Reisert | Naoya Inoue | Pride Kavumba | Kentaro Inui
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)

Recognizing the implicit link between a claim and a piece of evidence (i.e., a warrant) is the key to improving the performance of evidence detection. In this work, we explore the effectiveness of automatically extracted warrants for evidence detection. Given a claim and candidate evidence, our proposed method extracts multiple warrants via similarity search from an existing, structured corpus of arguments. We then attentively aggregate the extracted warrants, considering the consistency between the given argument and the acquired warrants. Although a qualitative analysis of the warrants shows that the extraction method needs to be improved, our results indicate that our method can still improve the performance of evidence detection.
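
The two steps named in the abstract above, similarity search over a corpus followed by attention-style aggregation, can be sketched roughly as follows. This is not the authors' model: bag-of-words cosine similarity and a softmax over the similarity scores are stand-ins assumed for this example, and the function names and toy corpus are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_warrants(query, corpus, k=2):
    """Return the k corpus entries most similar to the query, with scores."""
    q = bow(query)
    scored = [(cosine(q, bow(w)), w) for w in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def attention_weights(scores):
    """Softmax over scores; used here as a simple stand-in for learned attention."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy corpus and query, invented for illustration.
corpus = [
    "smoking damages the lungs",
    "exercise improves heart health",
    "taxes fund public schools",
]
query = "smoking should be banned because smoking causes lung cancer"

top = retrieve_warrants(query, corpus, k=2)
weights = attention_weights([score for score, _ in top])
for (score, warrant), weight in zip(top, weights):
    print(f"{weight:.2f}  {warrant}")
```

A real system would presumably use learned sentence encoders and trained attention parameters; the point of the sketch is only that retrieved warrants receive unequal weights, so a poorly matching warrant contributes less to the aggregated representation.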