Sociodemographic Bias in Language Models: A Survey and Forward Path

Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, Rebecca Passonneau


Abstract
Sociodemographic bias in language models (LMs) has the potential for harm when these models are deployed in real-world settings. This paper presents a comprehensive survey of the past decade of research on sociodemographic bias in LMs, organized into a typology that facilitates examining the different aims: types of bias, quantifying bias, and debiasing techniques. We track the evolution of the latter two questions, then identify current trends and their limitations, as well as emerging techniques. To guide future research towards more effective and reliable solutions, and to help authors situate their work within this broad landscape, we conclude with a checklist of open questions.
Anthology ID:
2024.gebnlp-1.19
Volume:
Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Agnieszka Faleńska, Christine Basta, Marta Costa-jussà, Seraphina Goldfarb-Tarrant, Debora Nozza
Venues:
GeBNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
295–322
URL:
https://preview.aclanthology.org/add-orcids-2024-emnlp/2024.gebnlp-1.19/
DOI:
10.18653/v1/2024.gebnlp-1.19
Cite (ACL):
Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, and Rebecca Passonneau. 2024. Sociodemographic Bias in Language Models: A Survey and Forward Path. In Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 295–322, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Sociodemographic Bias in Language Models: A Survey and Forward Path (Gupta et al., GeBNLP 2024)
PDF:
https://preview.aclanthology.org/add-orcids-2024-emnlp/2024.gebnlp-1.19.pdf