Do Prevalent Bias Metrics Capture Allocational Harms from LLMs?

Hannah Cyberey, Yangfeng Ji, David Evans


Abstract
Allocational harms occur when resources or opportunities are unfairly withheld from specific groups. Many proposed bias measures evaluate model predictions while ignoring the decisions that are ultimately made from those predictions. Our work examines the reliability of current bias metrics in assessing allocational harms arising from predictions of large language models (LLMs). We evaluate their predictive validity and utility for model selection across ten LLMs and two allocation tasks. Our results reveal that commonly used bias metrics based on average performance gap and distribution distance fail to reliably capture group disparities in allocation outcomes. Our work highlights the need to account for how model predictions are used in decisions, particularly in contexts where decisions are constrained by the limited resources being allocated.
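As a toy illustration of the distinction the abstract draws (not code from the paper), the sketch below uses hypothetical scores for two groups: the average score gap between groups is near zero, yet a top-k allocation based on those scores still selects the groups at very different rates. All names and numbers here are illustrative assumptions.

```python
# Toy sketch: an average performance gap can be ~0 while a
# top-k allocation still produces a large group disparity.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model scores: same mean, different spread.
scores_a = rng.normal(loc=0.50, scale=0.05, size=1000)  # group A
scores_b = rng.normal(loc=0.50, scale=0.20, size=1000)  # group B

# "Average performance gap"-style metric: difference in mean scores.
avg_gap = scores_a.mean() - scores_b.mean()

# Allocation outcome: select the top-k candidates overall by score.
k = 200
all_scores = np.concatenate([scores_a, scores_b])
groups = np.array(["A"] * len(scores_a) + ["B"] * len(scores_b))
selected = groups[np.argsort(all_scores)[-k:]]

# Group disparity in the allocation: difference in selection rates.
rate_a = (selected == "A").sum() / len(scores_a)
rate_b = (selected == "B").sum() / len(scores_b)

print(f"average score gap:        {avg_gap:+.3f}")
print(f"selection rate (A vs. B): {rate_a:.3f} vs. {rate_b:.3f}")
```

Because group B's scores have a heavier upper tail, it dominates the top-k selection even though the mean gap is negligible, which is the kind of mismatch between prediction-level metrics and allocation outcomes the paper investigates.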
Anthology ID: 2025.insights-1.5
Volume: The Sixth Workshop on Insights from Negative Results in NLP
Month: May
Year: 2025
Address: Albuquerque, New Mexico
Editors: Aleksandr Drozd, João Sedoc, Shabnam Tafreshi, Arjun Akula, Raphael Shu
Venues: insights | WS
Publisher: Association for Computational Linguistics
Pages: 34–45
URL: https://preview.aclanthology.org/fix-sig-urls/2025.insights-1.5/
Cite (ACL): Hannah Cyberey, Yangfeng Ji, and David Evans. 2025. Do Prevalent Bias Metrics Capture Allocational Harms from LLMs?. In The Sixth Workshop on Insights from Negative Results in NLP, pages 34–45, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Do Prevalent Bias Metrics Capture Allocational Harms from LLMs? (Cyberey et al., insights 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.insights-1.5.pdf