Rami El-Wazzi

2026

Bias Mitigation in Hiring-Related NLP: Interactions Between Masking, Rewriting, and Adversarial Debiasing
Alexandre Puttick | Rami El-Wazzi
Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities

AI-driven language technologies are increasingly used in hiring, but they may encode and reproduce harmful social stereotypes. Prior work often studies bias mitigation methods in isolation and outside realistic application settings. We examine the combined effects of data-level and model-level debiasing in a hiring-related context, using Norwegian-language academic bios and a proxy STEM/non-STEM classification task. Specifically, we study masking sensitive information, GenWriter-based rewrites (CITATION), and adversarial debiasing (CITATION). We evaluate these interventions using downstream task performance, group fairness metrics, intrinsic bias tests based on WEAT (CITATION), and measures of gender leakage from hidden representations. We find that combining masking, GenWriter rewrites, and adversarial debiasing substantially reduces gender leakage while maintaining or improving downstream performance. However, effects on fairness gaps and intrinsic bias are mixed, underscoring the need for downstream, context-sensitive evaluation of bias mitigation methods in hiring-related NLP.

Co-authors

Alexandre Puttick (1)

Venues

NLP4DH (1)
WS (1)
