Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs.
Clement Christophe, Tathagata Raha, Svetlana Maslenkova, Muhammad Umar Salman, Praveenkumar Kanithi, Marco AF Pimentel, Shadab Khan
Abstract
Large Language Models (LLMs) have demonstrated significant potential in revolutionizing clinical applications. In this study, we investigate the efficacy of four techniques in adapting LLMs for clinical use-cases: continuous pretraining, instruct fine-tuning, NEFTune, and prompt engineering. We employ these methods on Mistral 7B and Mixtral 8x7B models, leveraging a large-scale clinical pretraining dataset of 50 billion tokens and an instruct fine-tuning dataset of 500 million tokens. Our evaluation across various clinical tasks reveals nuanced insights. While continuous pretraining beyond 250 billion tokens yields marginal improvements, instruct fine-tuning emerges as a more influential factor. Notably, NEFTune, designed primarily to enhance generation quality, surprisingly demonstrates additional gains on our benchmark. These findings underscore the importance of tailoring fine-tuning strategies and exploring innovative techniques to optimize LLM performance in the clinical domain.
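The NEFTune technique mentioned in the abstract refers to noisy-embedding fine-tuning, which perturbs token embeddings with scaled uniform noise during instruction tuning. The snippet below is a minimal illustrative sketch of that idea, not the authors' implementation; the hook-based attachment, the `alpha` value, and the Hugging Face-style usage line are assumptions added for demonstration.

```python
import torch


def neftune_embedding_hook(alpha: float = 5.0):
    """Forward hook adding NEFTune-style uniform noise to token embeddings.

    Sketch only: ``alpha`` is a hypothetical setting, not a value reported
    in the paper.
    """

    def hook(module, inputs, output):
        # Only perturb embeddings while training; leave inference untouched.
        if not module.training:
            return output
        # NEFTune scales Uniform(-1, 1) noise by alpha / sqrt(L * d),
        # where L is the sequence length and d the embedding dimension.
        seq_len, dim = output.shape[-2], output.shape[-1]
        scale = alpha / (seq_len * dim) ** 0.5
        noise = torch.empty_like(output).uniform_(-1.0, 1.0)
        return output + scale * noise

    return hook


# Hypothetical usage with a Hugging Face causal LM (names are illustrative):
# model.get_input_embeddings().register_forward_hook(neftune_embedding_hook(alpha=5.0))
```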
- Anthology ID: 2024.findings-emnlp.618
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 10549–10561
- URL: https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.618/
- DOI: 10.18653/v1/2024.findings-emnlp.618
- Cite (ACL): Clement Christophe, Tathagata Raha, Svetlana Maslenkova, Muhammad Umar Salman, Praveenkumar Kanithi, Marco AF Pimentel, and Shadab Khan. 2024. Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 10549–10561, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs. (Christophe et al., Findings 2024)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.618.pdf