Measuring Gender Bias in Natural Language Processing: Incorporating Gender-Neutral Linguistic Forms for Non-Binary Gender Identities in Abusive Speech Detection

Nasim Sobhani; Kinshuk Sengupta; Sarah Jane Delany

Abstract

Predictions from machine learning models can reflect bias in the data on which they are trained. Gender bias has been shown to be prevalent in natural language processing models. Research into identifying and mitigating gender bias in these models predominantly treats gender as binary (male and female), neglecting the fluidity and continuity of gender as a variable. In this paper, we present an approach to evaluating gender bias in a prediction task that recognises the non-binary nature of gender. We gender-neutralise a random subset of existing real-world hate speech data, and we extend the existing template approach for measuring gender bias to include test examples that are gender-neutral. Measuring the bias across a selection of hate speech datasets, we show that the bias for the gender-neutral data is closer to that seen for test instances that identify as male than for those that identify as female.
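As a rough illustration of what a template approach extended with gender-neutral test examples might look like, the following Python sketch fills sentence templates with identity terms for three groups and compares how often a classifier flags each group. The templates, the identity terms, the `toy_predict` stand-in classifier, and the per-group flag rate are all hypothetical choices made for this sketch; the abstract does not specify the paper's actual templates, term lists, or bias metric.

```python
from itertools import product

# Hypothetical identity terms per gender group (illustrative only,
# not taken from the paper).
IDENTITY_TERMS = {
    "male": ["my brother", "my father"],
    "female": ["my sister", "my mother"],
    "neutral": ["my sibling", "my parent"],
}

# Hypothetical templates; "{person}" is the slot filled by an identity term.
TEMPLATES = [
    "{person} went to the shop today",
    "I really cannot stand {person}",
]


def build_test_set(templates, identity_terms):
    """Return (group, sentence) pairs, one per template and identity term."""
    rows = []
    for group, terms in identity_terms.items():
        for template, term in product(templates, terms):
            rows.append((group, template.format(person=term)))
    return rows


def flag_rate_by_group(rows, predict):
    """Fraction of sentences that `predict` flags as abusive, per group."""
    totals, flagged = {}, {}
    for group, sentence in rows:
        totals[group] = totals.get(group, 0) + 1
        flagged[group] = flagged.get(group, 0) + int(predict(sentence))
    return {group: flagged[group] / totals[group] for group in totals}


def toy_predict(sentence):
    # Stand-in for a trained abusive-speech classifier (assumption):
    # it flags a sentence purely on a fixed phrase, ignoring gender.
    return "cannot stand" in sentence


rows = build_test_set(TEMPLATES, IDENTITY_TERMS)
rates = flag_rate_by_group(rows, toy_predict)
print(rates)
```

Because the sentences within a template differ only in the identity term, any gap between the per-group rates of a real classifier would point to the gender term itself influencing the prediction. The toy classifier here ignores gender, so all three groups receive the same rate.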

Anthology ID: 2023.ranlp-1.119
Volume: Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month: September
Year: 2023
Address: Varna, Bulgaria
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd., Shoumen, Bulgaria
Pages: 1121–1131
URL: https://aclanthology.org/2023.ranlp-1.119
Cite (ACL): Nasim Sobhani, Kinshuk Sengupta, and Sarah Jane Delany. 2023. Measuring Gender Bias in Natural Language Processing: Incorporating Gender-Neutral Linguistic Forms for Non-Binary Gender Identities in Abusive Speech Detection. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 1121–1131, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal): Measuring Gender Bias in Natural Language Processing: Incorporating Gender-Neutral Linguistic Forms for Non-Binary Gender Identities in Abusive Speech Detection (Sobhani et al., RANLP 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-1/2023.ranlp-1.119.pdf
