Curation of Benchmark Templates for Measuring Gender Bias in Named Entity Recognition Models

Ana Cimitan, Ana Alves Pinto, Michaela Geierhos


Abstract
Named Entity Recognition (NER) is a popular machine learning technique that underpins many natural language processing (NLP) applications. As with other machine learning applications, NER models have been shown to be susceptible to gender bias. Such bias is often assessed using benchmark datasets, which are in turn curated specifically for a given NLP task. In this work, we investigate the robustness of benchmark templates for detecting gender bias and propose a novel method to improve the curation of such datasets. The method, based on masked token prediction, aims to retain only those benchmark templates with a higher probability of detecting gender bias in NER models. We tested the method for English and German, using the corresponding fine-tuned BERT base model (cased) as the NER model. The gender gaps detected with templates classified as appropriate by the method were statistically significantly larger than those detected with inappropriate templates. The results were similar for both languages and support the use of the proposed method in the curation of templates designed to detect gender bias.
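To illustrate the masked-token-prediction idea described in the abstract, the following is a minimal sketch, not the authors' exact procedure: it assumes the HuggingFace transformers library and a bert-base-cased masked language model, and it uses a hypothetical [NAME] placeholder and a simple scoring heuristic (probability mass assigned to person names in the entity slot) as a proxy for how suitable a template is for probing gender bias.

```python
# Hypothetical sketch of scoring benchmark templates with masked token prediction.
# Assumptions: HuggingFace `transformers`, bert-base-cased, a "[NAME]" placeholder,
# and a probability-mass heuristic; none of these are confirmed details of the paper.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

def name_slot_probability(template: str, candidate_names: list[str]) -> float:
    """Total probability the masked LM assigns to person names in the entity slot.

    A higher value suggests the slot clearly expects a person name, so the
    template is more likely to be useful for detecting gender bias.
    """
    masked = template.replace("[NAME]", fill_mask.tokenizer.mask_token)
    predictions = fill_mask(masked, targets=candidate_names)
    return sum(p["score"] for p in predictions)

# Example usage: compare two illustrative templates; under this heuristic,
# the one whose slot more strongly predicts person names would be kept.
templates = [
    "[NAME] gave a lecture at the university.",
    "The report was written by [NAME].",
]
names = ["John", "Mary", "Anna", "Peter"]
for t in templates:
    print(t, "->", round(name_slot_probability(t, names), 4))
```

A threshold on such a score could then separate "appropriate" from "inappropriate" templates before measuring gender gaps with the NER model; the threshold and name list used here are purely illustrative.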
Anthology ID:
2024.lrec-main.378
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
4238–4246
URL:
https://aclanthology.org/2024.lrec-main.378
Cite (ACL):
Ana Cimitan, Ana Alves Pinto, and Michaela Geierhos. 2024. Curation of Benchmark Templates for Measuring Gender Bias in Named Entity Recognition Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4238–4246, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Curation of Benchmark Templates for Measuring Gender Bias in Named Entity Recognition Models (Cimitan et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.lrec-main.378.pdf