Monolingual and Multilingual Reduction of Gender Bias in Contextualized Representations

Sheng Liang, Philipp Dufter, Hinrich Schütze


Abstract
Pretrained language models (PLMs) learn stereotypes held by humans and reflected in text from their training corpora, including gender bias. When PLMs are used for downstream tasks such as picking candidates for a job, people’s lives can be negatively affected by these learned stereotypes. Prior work usually identifies a linear gender subspace and removes gender information by eliminating the subspace. Following this line of work, we propose to use DensRay, an analytical method for obtaining interpretable dense subspaces. We show that DensRay performs on-par with prior approaches, but provide arguments that it is more robust and provide indications that it preserves language model performance better. By applying DensRay to attention heads and layers of BERT we show that gender information is spread across all attention heads and most of the layers. Also we show that DensRay can obtain gender bias scores on both token and sentence levels. Finally, we demonstrate that we can remove bias multilingually, e.g., from Chinese, using only English training data.
Anthology ID:
2020.coling-main.446
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
SIG:
Publisher:
International Committee on Computational Linguistics
Note:
Pages:
5082–5093
Language:
URL:
https://aclanthology.org/2020.coling-main.446
DOI:
10.18653/v1/2020.coling-main.446
Bibkey:
Cite (ACL):
Sheng Liang, Philipp Dufter, and Hinrich Schütze. 2020. Monolingual and Multilingual Reduction of Gender Bias in Contextualized Representations. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5082–5093, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Monolingual and Multilingual Reduction of Gender Bias in Contextualized Representations (Liang et al., COLING 2020)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2020.coling-main.446.pdf
Code
 liangsheng02/densray-debiasing
Data
GLUEWikiText-2