Abstract
Disparities in authorship and citations across gender can have substantial adverse consequences not just on the disadvantaged genders, but also on the field of study as a whole. Measuring gender gaps is a crucial step towards addressing them. In this work, we examine female first author percentages and the citations to their papers in Natural Language Processing (1965 to 2019). We determine aggregate-level statistics using an existing manually curated author–gender list as well as first names strongly associated with a gender. We find that only about 29% of first authors are female and only about 25% of last authors are female. Notably, this percentage has not improved since the mid 2000s. We also show that, on average, female first authors are cited less than male first authors, even when controlling for experience and area of research. Finally, we discuss the ethical considerations involved in automatic demographic analysis.
- Anthology ID:
- 2020.acl-main.702
- Original:
- 2020.acl-main.702v1
- Version 2:
- 2020.acl-main.702v2
- Volume:
- Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7860–7870
- URL:
- https://aclanthology.org/2020.acl-main.702
- DOI:
- 10.18653/v1/2020.acl-main.702
- Cite (ACL):
- Saif M. Mohammad. 2020. Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7860–7870, Online. Association for Computational Linguistics.
- Cite (Informal):
- Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations (Mohammad, ACL 2020)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2020.acl-main.702.pdf