Measuring Bias in Contextualized Word Representations
Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, Yulia Tsvetkov
Abstract
Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks. Since they are optimized to capture the statistical properties of training data, they tend to pick up on and amplify social stereotypes present in the data as well. In this study, we (1) propose a template-based method to quantify bias in BERT; (2) show that this method obtains more consistent results in capturing social biases than the traditional cosine-based method; and (3) conduct a case study, evaluating gender bias in a downstream task of Gender Pronoun Resolution. Although our case study focuses on gender bias, the proposed technique is generalizable to unveiling other biases, including in multiclass settings, such as racial and religious biases.
- Anthology ID:
- W19-3823
- Volume:
- Proceedings of the First Workshop on Gender Bias in Natural Language Processing
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster
- Venue:
- GeBNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 166–172
- URL:
- https://aclanthology.org/W19-3823
- DOI:
- 10.18653/v1/W19-3823
- Cite (ACL):
- Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, and Yulia Tsvetkov. 2019. Measuring Bias in Contextualized Word Representations. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pages 166–172, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Measuring Bias in Contextualized Word Representations (Kurita et al., GeBNLP 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/W19-3823.pdf