WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning
Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, Michael Gertz
Abstract
Fine-tuning biomedical pre-trained language models (BioPLMs) such as BioBERT has become a common practice dominating leaderboards across various natural language processing tasks. Despite their success and wide adoption, prevailing fine-tuning approaches for named entity recognition (NER) naively train BioPLMs on targeted datasets without considering class distributions. This is especially problematic when dealing with imbalanced biomedical gold-standard datasets for NER, in which most biomedical entities are underrepresented. In this paper, we address the class imbalance problem and propose WeLT, a cost-sensitive fine-tuning approach based on new re-scaled class weights for the task of biomedical NER. We evaluate WeLT’s fine-tuning performance on mixed-domain and domain-specific BioPLMs using eight biomedical gold-standard datasets. We compare our approach against vanilla fine-tuning and three other existing re-weighting schemes. Our results show the positive impact of handling the class imbalance problem: WeLT outperforms all the vanilla fine-tuned models. Furthermore, our method demonstrates advantages over other existing weighting schemes in most experiments.
- Anthology ID:
- 2023.bionlp-1.40
- Volume:
- The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Dina Demner-fushman, Sophia Ananiadou, Kevin Cohen
- Venue:
- BioNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 427–438
- URL:
- https://aclanthology.org/2023.bionlp-1.40
- DOI:
- 10.18653/v1/2023.bionlp-1.40
- Cite (ACL):
- Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, and Michael Gertz. 2023. WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 427–438, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning (Mobasher et al., BioNLP 2023)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2023.bionlp-1.40.pdf
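To make the idea of cost-sensitive fine-tuning concrete, the sketch below plugs a class-weighted cross-entropy loss into a Hugging Face `Trainer` for token classification. This is a minimal illustration only: the `inverse_frequency_weights` helper and the label counts are assumptions for the example, not necessarily the exact WeLT re-scaling defined in the paper.

```python
# Minimal sketch: class-weighted cross-entropy for BioPLM fine-tuning on NER.
# NOTE: the weighting formula below is a generic inverse-frequency scheme used
# for illustration; it is not claimed to be the exact WeLT re-scaling.
import torch
from torch import nn
from transformers import Trainer


def inverse_frequency_weights(label_counts):
    """Return per-class weights that up-weight rare entity tags.

    label_counts: iterable of token counts per label id (assumed input).
    The weights are normalized to mean 1 so the overall loss scale stays
    comparable to unweighted fine-tuning.
    """
    counts = torch.tensor(label_counts, dtype=torch.float)
    weights = counts.sum() / (len(counts) * counts)  # inverse class frequency
    return weights / weights.mean()


class WeightedTrainer(Trainer):
    """Trainer subclass that replaces the default loss with a
    class-weighted cross-entropy over token labels."""

    def __init__(self, class_weights, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits  # shape: (batch, seq_len, num_labels)
        loss_fct = nn.CrossEntropyLoss(
            weight=self.class_weights.to(logits.device),
            ignore_index=-100,  # skip padding / special tokens
        )
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```

Other re-weighting schemes (including the paper's WeLT weights or the baselines it compares against) can be swapped in by replacing the weight computation while keeping the same weighted loss.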