2025
Jailbreak Distillation: Renewable Safety Benchmarking
Jingyu Zhang | Ahmed Elgohary | Xiawei Wang | A S M Iftekhar | Ahmed Magooda | Benjamin Van Durme | Daniel Khashabi | Kyle Jackson
Findings of the Association for Computational Linguistics: EMNLP 2025
Large language models (LLMs) are rapidly deployed in critical applications, raising urgent needs for robust safety benchmarking. We propose Jailbreak Distillation (JBDistill), a novel benchmark construction framework that “distills” jailbreak attacks into high-quality and easily-updatable safety benchmarks. JBDistill utilizes a small set of development models and existing jailbreak attack algorithms to create a candidate prompt pool, then employs prompt selection algorithms to identify an effective subset of prompts as safety benchmarks. JBDistill addresses challenges in existing safety evaluation: the use of consistent evaluation prompts across models ensures fair comparisons and reproducibility. It requires minimal human effort to rerun the JBDistill pipeline and produce updated benchmarks, alleviating concerns on saturation and contamination. Extensive experiments demonstrate our benchmarks generalize robustly to 13 diverse evaluation models held out from benchmark construction, including proprietary, specialized, and newer-generation LLMs, significantly outperforming existing safety benchmarks in effectiveness while maintaining high separability and diversity. Our framework thus provides an effective, sustainable, and adaptable solution for streamlining safety evaluation.
2024
Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking
Mohamed Elaraby | Diane Litman | Xiang Lorraine Li | Ahmed Magooda
Findings of the Association for Computational Linguistics: EMNLP 2024
Generating free-text rationales is among the emergent capabilities of Large Language Models (LLMs). These rationales have been found to enhance LLM performance across various NLP tasks. Recently, there has been growing interest in using these rationales to provide insights for various important downstream tasks. In this paper, we analyze generated free-text rationales in tasks with subjective answers, emphasizing the importance of rationalization in such scenarios. We focus on pairwise argument ranking, a highly subjective task with significant potential for real-world applications, such as debate assistance. We evaluate the persuasiveness of rationales generated by nine LLMs to support their subjective choices. Our findings suggest that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models. Additionally, our experiments demonstrate that the persuasiveness of the generated rationales can be enhanced by guiding their persuasive elements through prompting or self-refinement techniques.
2021
Exploring Multitask Learning for Low-Resource Abstractive Summarization
Ahmed Magooda | Diane Litman | Mohamed Elaraby
Findings of the Association for Computational Linguistics: EMNLP 2021
This paper explores the effect of using multitask learning for abstractive summarization in the context of small training corpora. In particular, we incorporate four different tasks (extractive summarization, language modeling, concept detection, and paraphrase detection) both individually and in combination, with the goal of enhancing the target task of abstractive summarization via multitask learning. We show that for many task combinations, a model trained in a multitask setting outperforms a model trained only for abstractive summarization, with no additional summarization data introduced. Additionally, we do a comprehensive search and find that certain tasks (e.g. paraphrase detection) consistently benefit abstractive summarization, not only when combined with other tasks but also when using different architectures and training corpora.
Mitigating Data Scarceness through Data Synthesis, Augmentation and Curriculum for Abstractive Summarization
Ahmed Magooda | Diane Litman
Findings of the Association for Computational Linguistics: EMNLP 2021
This paper explores three simple data manipulation techniques (synthesis, augmentation, curriculum) for improving abstractive summarization models without the need for any additional data. We introduce a method of data synthesis with paraphrasing, a data augmentation technique with sample mixing, and curriculum learning with two new difficulty metrics based on specificity and abstractiveness. We conduct experiments to show that these three techniques can help improve abstractive summarization across two summarization models and two different small datasets. Furthermore, we show that these techniques can improve performance when applied in isolation and when combined.
2016
RDI_Team at SemEval-2016 Task 3: RDI Unsupervised Framework for Text Ranking
Ahmed Magooda | Amr Gomaa | Ashraf Mahgoub | Hany Ahmed | Mohsen Rashwan | Hazem Raafat | Eslam Kamal | Ahmad Al Sallab
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)