Anonymized BERT: An Augmentation Approach to the Gendered Pronoun Resolution Challenge

Bo Liu


Abstract
We present our 7th place solution to the Gendered Pronoun Resolution challenge, which uses BERT without fine-tuning and a novel augmentation strategy designed for contextual embedding token-level tasks. Our method anonymizes the referent by replacing candidate names with a set of common placeholder names. Besides the usual benefit of effectively increasing training data size, this approach diversifies the idiosyncratic information embedded in names. Using the same set of common first names also helps the model recognize names better, shortens token length, and removes gender and regional biases associated with names. The system scored 0.1947 log loss in stage 2, of which the augmentation contributed an improvement of 0.04. Post-competition analysis shows that, when using different embedding layers, the system scores 0.1799, which would have placed third.
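The anonymization step described in the abstract (replacing candidate names with common placeholder names) might be sketched as follows. This is a hypothetical illustration, not the paper's actual code; the placeholder name list, function name, and example sentence are all assumptions.

```python
# Hypothetical sketch of the name-anonymization augmentation: the two
# candidate names in a GAP-style example are swapped for fixed common
# placeholder first names before the text is fed to BERT.
# The placeholder list below is illustrative, not taken from the paper.
PLACEHOLDER_NAMES = ["Alice", "Kate", "Mary", "James", "John", "Michael"]

def anonymize(text, name_a, name_b, placeholder_a="Alice", placeholder_b="Kate"):
    """Replace candidate names A and B with placeholder names.

    Returns the anonymized text plus the placeholders, so the
    token-level labels can be re-aligned to the new name spans.
    """
    text = text.replace(name_a, placeholder_a)
    text = text.replace(name_b, placeholder_b)
    return text, placeholder_a, placeholder_b

# Example: one training sentence, anonymized with one placeholder pair.
# Augmentation would repeat this with several placeholder pairs per example.
sentence = "Cheryl thanked Pauline, and she left."
anon, a, b = anonymize(sentence, "Cheryl", "Pauline")
print(anon)  # Alice thanked Kate, and she left.
```

Repeating this substitution with several placeholder pairs per training example yields the data-size increase the abstract mentions, while the fixed, common placeholder names strip away gender and regional cues carried by the original names.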
Anthology ID:
W19-3818
Volume:
Proceedings of the First Workshop on Gender Bias in Natural Language Processing
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster
Venue:
GeBNLP
Publisher:
Association for Computational Linguistics
Pages:
120–125
URL:
https://aclanthology.org/W19-3818
DOI:
10.18653/v1/W19-3818
Cite (ACL):
Bo Liu. 2019. Anonymized BERT: An Augmentation Approach to the Gendered Pronoun Resolution Challenge. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pages 120–125, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Anonymized BERT: An Augmentation Approach to the Gendered Pronoun Resolution Challenge (Liu, GeBNLP 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/W19-3818.pdf
Code
 boliu61/gendered-pronoun-resolution
Data
GAP Coreference Dataset