@inproceedings{schaefer-roberts-2025-gender,
title = "Gender-Neutral Large Language Models for Medical Applications: Reducing Bias in {P}ub{M}ed Abstracts",
author = "Schaefer, Elizabeth and
Roberts, Kirk",
editor = "Demner-Fushman, Dina and
Ananiadou, Sophia and
Miwa, Makoto and
Tsujii, Junichi",
booktitle = "ACL 2025",
month = aug,
year = "2025",
address = "Viena, Austria",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-1.11/",
pages = "114--123",
ISBN = "979-8-89176-275-6",
abstract = "This paper presents a pipeline for mitigating gender bias in large language models (LLMs) used in medical literature by neutralizing gendered occupational pronouns. A set of 379,000 PubMed abstracts from 1965-1980 was processed to identify and modify pronouns tied to professions. We developed a BERT-based model, Modern Occupational Bias Elimination with Refined Training, or MOBERT, trained on these neutralized abstracts, and compared it with 1965BERT, trained on the original dataset. MOBERT achieved a 70{\%} inclusive replacement rate, while 1965BERT reached only 4{\%}. A further analysis of MOBERT revealed that pronoun replacement accuracy correlated with the frequency of occupational terms in the training data. We propose expanding the dataset and refining the pipeline to improve performance and ensure more equitable language modeling in medical applications."
}
Markdown (Informal)
[Gender-Neutral Large Language Models for Medical Applications: Reducing Bias in PubMed Abstracts](https://aclanthology.org/2025.bionlp-1.11/) (Schaefer & Roberts, BioNLP 2025)
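
The abstract describes a pipeline that neutralizes gendered pronouns tied to occupational terms before training. The paper's actual rules are not reproduced here, so the sketch below is only a minimal Python illustration of that idea under stated assumptions: the occupation list, the fixed token window, and the naive pronoun map are all hypothetical, not the authors' method.

```python
# Illustrative sketch only: the occupation list, context window, and pronoun
# mappings below are assumptions for demonstration, not the paper's pipeline.

OCCUPATIONS = {"surgeon", "physician", "nurse", "technician", "pharmacist"}

PRONOUN_MAP = {
    "he": "they",
    "she": "they",
    "him": "them",
    "his": "their",
    # "her" is ambiguous (object "them" vs. possessive "their"); a real
    # pipeline would disambiguate, e.g. with POS tags. We assume possessive.
    "her": "their",
    "himself": "themself",
    "herself": "themself",
}


def neutralize(text: str, window: int = 10) -> str:
    """Replace gendered pronouns found within `window` tokens of an
    occupational term with gender-neutral forms."""
    tokens = text.split()
    bare = [t.lower().strip(".,;:!?()\"'") for t in tokens]
    occ_idx = [i for i, t in enumerate(bare) if t in OCCUPATIONS]
    out = []
    for i, (tok, b) in enumerate(zip(tokens, bare)):
        if b in PRONOUN_MAP and any(abs(i - j) <= window for j in occ_idx):
            repl = PRONOUN_MAP[b]
            if tok[0].isupper():  # keep sentence-initial capitalization
                repl = repl.capitalize()
            # re-attach punctuation stripped when the token was lowercased
            start = tok.lower().find(b)
            out.append(tok[:start] + repl + tok[start + len(b):])
        else:
            out.append(tok)
    return " ".join(out)


if __name__ == "__main__":
    print(neutralize("The surgeon said she would review his scan today."))
    # -> The surgeon said they would review their scan today.
    # Note: verb agreement ("she was" -> "they was") is not handled here;
    # a production pipeline would also need to repair agreement.
```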