Should We Ban English NLP for a Year?

Anders Søgaard


Abstract
Around two-thirds of NLP research at top venues is devoted exclusively to developing technology for speakers of English, most speech data comes from young urban speakers, and most texts used to train language models come from male writers. These biases feed into consumer technologies, widening existing inequality gaps not only within societies but also across them. Many have argued that it is almost impossible to mitigate inequality amplification. I argue that, on the contrary, it is quite simple to do so, and that countermeasures would have little to no negative impact, except perhaps in the very short term.
Anthology ID:
2022.emnlp-main.351
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5254–5260
URL:
https://aclanthology.org/2022.emnlp-main.351
DOI:
10.18653/v1/2022.emnlp-main.351
Cite (ACL):
Anders Søgaard. 2022. Should We Ban English NLP for a Year? In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 5254–5260, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Should We Ban English NLP for a Year? (Søgaard, EMNLP 2022)
PDF:
https://preview.aclanthology.org/ingest-acl-2023-videos/2022.emnlp-main.351.pdf