Fair Is Better than Sensational: Man Is to Doctor as Woman Is to Doctor

Malvina Nissim, Rik van Noord, Rob van der Goot


Abstract
Analogies such as man is to king as woman is to X are often used to illustrate the amazing power of word embeddings. Concurrently, they have also been used to expose how strongly human biases are encoded in vector spaces trained on natural language, with examples like man is to computer programmer as woman is to homemaker. Recent work has shown that analogies are in fact not an accurate diagnostic for bias, but this does not mean that they are not used anymore, or that their legacy is fading. Instead of focusing on the intrinsic problems of the analogy task as a bias detection tool, we discuss a series of issues involving implementation as well as subjective choices that might have yielded a distorted picture of bias in word embeddings. We stand by the truth that human biases are present in word embeddings, and, of course, the need to address them. But analogies are not an accurate tool to do so, and the way they have been most often used has exacerbated some possibly non-existing biases and perhaps hidden others. Because they are still widely popular, and some of them have become classics within and outside the NLP community, we deem it important to provide a series of clarifications that should put well-known, and potentially new analogies, into the right perspective.
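As an illustration of how an analogy query of the form "man is to doctor as woman is to X" is typically computed, the sketch below uses the common vector-offset approach: take the vector for doctor, subtract man, add woman, and return the nearest word by cosine similarity. It is not the authors' code. The vocabulary, the three-dimensional vectors, and the names `TOY_VECTORS`, `cosine`, and `analogy` are all invented for this example, and the vectors were hand-picked so that the effect is visible; they say nothing about any real embedding space. The `exclude_inputs` flag stands in for one implementation choice of the kind the abstract alludes to, namely whether the query words themselves are allowed as answers.

```python
import math

# Hypothetical toy vectors, hand-picked for illustration only.
# They are NOT taken from any trained embedding model.
TOY_VECTORS = {
    "man":    [0.2, 0.0, 1.0],
    "woman":  [-0.2, 0.0, 1.0],
    "doctor": [0.0, 1.0, 0.3],
    "nurse":  [-0.8, 0.6, 0.3],
    "king":   [0.6, 0.1, 0.6],
    "queen":  [-0.6, 0.1, 0.6],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors, exclude_inputs=True):
    """Answer 'a is to b as c is to X' with the vector-offset method.

    The target point is b - a + c; the answer is the vocabulary word
    whose vector has the highest cosine similarity to that point.
    If exclude_inputs is True, a, b, and c cannot be returned.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors
                  if not (exclude_inputs and w in (a, b, c))]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


if __name__ == "__main__":
    for flag in (True, False):
        answer = analogy("man", "doctor", "woman", TOY_VECTORS,
                         exclude_inputs=flag)
        print(f"exclude_inputs={flag}: {answer}")
```

With these toy vectors the same query returns a different word depending only on the flag, which is why the exact query procedure should be reported alongside any analogy offered as evidence of bias.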
Anthology ID:
2020.cl-2.7
Volume:
Computational Linguistics, Volume 46, Issue 2 - June 2020
Month:
June
Year:
2020
Venue:
CL
Pages:
487–497
URL:
https://aclanthology.org/2020.cl-2.7
DOI:
10.1162/coli_a_00379
Cite (ACL):
Malvina Nissim, Rik van Noord, and Rob van der Goot. 2020. Fair Is Better than Sensational: Man Is to Doctor as Woman Is to Doctor. Computational Linguistics, 46(2):487–497.
Cite (Informal):
Fair Is Better than Sensational: Man Is to Doctor as Woman Is to Doctor (Nissim et al., CL 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.cl-2.7.pdf