Do PLMs and Annotators Share the Same Gender Bias? Definition, Dataset, and Framework of Contextualized Gender Bias
Shucheng Zhu, Bingjie Du, Jishun Zhao, Ying Liu, Pengyuan Liu
Abstract
Pre-trained language models (PLMs) have achieved success in a wide range of natural language processing (NLP) tasks. However, PLMs also introduce disquieting safety problems, such as gender bias. Gender bias is an extremely complex issue, because different individuals may hold disparate opinions on whether the same sentence expresses harmful bias, especially for sentences that appear neutral or positive. This paper first defines the concept of contextualized gender bias (CGB), which makes it possible to measure implicit gender bias in both PLMs and annotators. We then construct CGBDataset, which contains 20k natural sentences with gendered words drawn from Chinese news. Following the masked language modeling task, gendered words are masked, and PLMs and annotators judge whether a male or a female word is more suitable. We further introduce CGBFrame to measure the gender bias of annotators. Comparing the results from PLMs and annotators, we find that although their choices differ in places, they show significant consistency overall.
- Anthology ID: 2024.gebnlp-1.2
- Volume: Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
- Month: August
- Year: 2024
- Address: Bangkok, Thailand
- Editors: Agnieszka Faleńska, Christine Basta, Marta Costa-jussà, Seraphina Goldfarb-Tarrant, Debora Nozza
- Venues: GeBNLP | WS
- Publisher: Association for Computational Linguistics
- Pages: 20–32
- URL: https://aclanthology.org/2024.gebnlp-1.2
- Cite (ACL): Shucheng Zhu, Bingjie Du, Jishun Zhao, Ying Liu, and Pengyuan Liu. 2024. Do PLMs and Annotators Share the Same Gender Bias? Definition, Dataset, and Framework of Contextualized Gender Bias. In Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 20–32, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal): Do PLMs and Annotators Share the Same Gender Bias? Definition, Dataset, and Framework of Contextualized Gender Bias (Zhu et al., GeBNLP-WS 2024)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.gebnlp-1.2.pdf
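The masked-gendered-word probe described in the abstract can be illustrated with a minimal sketch, assuming a HuggingFace fill-mask pipeline. The model name, the example sentence, and the candidate pronoun pair 他/她 ("he"/"she") are illustrative assumptions, not details taken from the paper or CGBDataset:

```python
from transformers import pipeline

# Minimal sketch of the masked gendered-word probe described in the abstract.
# Assumptions (not from the paper): the PLM, the example sentence, and the
# candidate pronoun pair are illustrative placeholders.
fill = pipeline("fill-mask", model="bert-base-chinese")

# "[MASK] is a nurse." -- the gendered word is masked, and the PLM scores
# a male vs. a female candidate for the blank.
sentence = "[MASK]是一名护士。"
candidates = ["他", "她"]  # "he" vs. "she"

for result in fill(sentence, targets=candidates):
    print(result["token_str"], round(result["score"], 4))
```

If the PLM assigns a clearly higher probability to one gendered word, that contextual preference is the kind of signal the paper compares against annotators' judgments of the same masked sentence.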