Muhammad Suhaib Rashid
2026
With a Grain of SALT: Are LLMs Fair Across Social Dimensions?
Samee Arif | Zohaib Khan | Maaidah Kaleem Butt | Muhammad Suhaib Rashid | Agha Ali Raza | Awais Athar
Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026)
In this paper, we present a systematic study of social bias in small- to mid-scale Large Language Models (LLMs), focusing on gender, religion, and race. Using our SALT (Social Appropriateness in LLM Text) dataset, we explore two bias categories: Theoretical and Practical. Theoretical bias covers General Debate and Positioned Debate, while Practical bias covers Career Advice, Personal Advice, and Resume Generation. For Theoretical bias, we quantify bias using win-rate gaps in General Debate and negative-role assignments in Positioned Debate. For Practical bias, we anonymize model outputs to remove explicit demographic cues and use DeepSeek-R1 as an automated evaluator, measuring outcome disparities across groups. We also examine systemic issues in LLM-based evaluation, including evaluation bias, positional bias, and length bias, and validate our findings through human annotation. Our results show consistent disadvantages for White, Christian, and male-associated outputs across multiple tasks. Larger models often amplify these disparities, highlighting that scale does not guarantee fairness.