Nikta Gohari Sadr

2026

ZIP: Quantifying Which Words Matter in Zero-Shot Instructional Prompts
Nikta Gohari Sadr | Sangmitra Madhusudan | Arash Asgari | Hassan Sajjad | Laleh Seyyed-Kalantari | Ali Emami
Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026)

While zero-shot instructional prompts like “Let’s think step-by-step” have revolutionized Large Language Model performance, we lack a systematic understanding of why: which specific words drive their effectiveness, and how do these patterns vary across tasks and models? We introduce the ZIP score (Zero-shot Importance of Perturbation), a metric that quantifies individual word importance through controlled, semantically meaningful perturbations. To enable rigorous evaluation, we also introduce the first ground-truth benchmark for prompt interpretability, a set of validation prompts with predetermined keywords where ZIP achieves 95.8% accuracy compared to 65.8% for LIME. Analyzing six flagship models across seven prompts and multiple task domains, we find that word importance is task-dependent (“step-by-step” dominates mathematical reasoning; “think” matters more for common-sense tasks), varies systematically across model families, and correlates inversely with model performance, suggesting prompts have the greatest impact on tasks where models struggle. Our findings advance prompt science, providing both practical guidance for prompt engineering and theoretical understanding of how instructional language shapes model behavior.

2025

Fine-Tuned LLMs are “Time Capsules” for Tracking Societal Bias Through Books
Sangmitra Madhusudan | Robert Morabito | Skye Reid | Nikta Gohari Sadr | Ali Emami
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Books, while often rich in cultural insights, can also mirror societal biases of their eras, biases that Large Language Models (LLMs) may learn and perpetuate during training. We introduce a novel method to trace and quantify these biases using fine-tuned LLMs. We develop BookPAGE, a corpus comprising 593 fictional books across seven decades (1950-2019), to track bias evolution. By fine-tuning LLMs on books from each decade and using targeted prompts, we examine shifts in biases related to gender, sexual orientation, race, and religion. Our findings indicate that LLMs trained on decade-specific books manifest biases reflective of their times, with both gradual trends and notable shifts. For example, model responses showed a progressive increase in the portrayal of women in leadership roles (from 8% to 22%) from the 1950s to the 2010s, with a significant uptick in the 1990s (from 4% to 12%), possibly aligning with third-wave feminism. Same-sex relationship references increased markedly from the 1980s to the 2000s (from 0% to 10%), mirroring growing LGBTQ+ visibility. Concerningly, negative portrayals of Islam rose sharply in the 2000s (26% to 38%), likely reflecting post-9/11 sentiments. Importantly, we demonstrate that these biases stem mainly from the books’ content and not the models’ architecture or initial training. Our study offers a new perspective on societal bias trends by bridging AI, literary studies, and social science research.

We Politely Insist: Your LLM Must Learn the Persian Art of Taarof
Nikta Gohari Sadr | Sahar Heidariasl | Karine Megerdoomian | Laleh Seyyed-Kalantari | Ali Emami
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) struggle to navigate culturally specific communication norms, limiting their effectiveness in global contexts. We focus on Persian *taarof*, a social norm in Iranian interactions, which is a sophisticated system of ritual politeness that emphasizes deference, modesty, and indirectness, yet remains absent from existing cultural benchmarks. We introduce **TaarofBench**, the first benchmark for evaluating LLM understanding of taarof, comprising 450 role-play scenarios covering 12 common social interaction topics, validated by native speakers. Our evaluation of five frontier LLMs reveals substantial gaps in cultural competence, with accuracy rates 40-48% below native speakers when taarof is culturally appropriate. Performance varies between interaction topics, improves with Persian-language prompts, and exhibits gender-based asymmetries. We also show that responses rated “polite” by standard metrics often violate taarof norms, indicating the limitations of Western politeness frameworks. Through supervised fine-tuning and Direct Preference Optimization, we achieve improvements of 21.8% and 42.3%, respectively, in model alignment with cultural expectations. Our human study with 33 participants (11 native Persian, 11 heritage, and 11 non-Iranian speakers) establishes baselines across varying degrees of familiarity with Persian norms. This work lays the foundation for developing diverse and culturally aware LLMs, enabling applications that better navigate complex social interactions.

Co-authors

Karine Megerdoomian 1

Robert Morabito 1

Skye Reid 1

Hassan Sajjad 1
