Pipelines for Social Bias Testing of Large Language Models

Debora Nozza, Federico Bianchi, Dirk Hovy


Abstract
The maturity level of language models is now at a stage in which many companies rely on them to solve various tasks. However, while research has shown how biased and harmful these models are, systematic ways of integrating social bias tests into development pipelines are still lacking. This short paper suggests how to use these verification techniques in development pipelines. We take inspiration from software testing and suggest addressing social bias evaluation as software testing. We hope to open a discussion on the best methodologies to handle social bias testing in language models.
Anthology ID:
2022.bigscience-1.6
Volume:
Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models
Month:
May
Year:
2022
Address:
virtual+Dublin
Editors:
Angela Fan, Suzana Ilic, Thomas Wolf, Matthias Gallé
Venue:
BigScience
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
68–74
Language:
URL:
https://aclanthology.org/2022.bigscience-1.6
DOI:
10.18653/v1/2022.bigscience-1.6
Bibkey:
Cite (ACL):
Debora Nozza, Federico Bianchi, and Dirk Hovy. 2022. Pipelines for Social Bias Testing of Large Language Models. In Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models, pages 68–74, virtual+Dublin. Association for Computational Linguistics.
Cite (Informal):
Pipelines for Social Bias Testing of Large Language Models (Nozza et al., BigScience 2022)
Copy Citation:
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.bigscience-1.6.pdf
Video:
 https://preview.aclanthology.org/emnlp-22-attachments/2022.bigscience-1.6.mp4